Large queue (avoid) -
- https://build.nvidia.com/z-ai/glm4_7
- https://build.nvidia.com/z-ai/glm5
- https://build.nvidia.com/qwen/qwen3.5-397b-a17b
- https://build.nvidia.com/minimaxai/minimax-m2.1
- https://build.nvidia.com/moonshotai/kimi-k2.5
Low queue (use) -
- https://build.nvidia.com/nvidia/nemotron-3-super-120b-a12b
- https://build.nvidia.com/qwen/qwen3.5-122b-a10b
- https://build.nvidia.com/minimaxai/minimax-m2.5
- https://build.nvidia.com/stepfun-ai/step-3.5-flash
- https://build.nvidia.com/openai/gpt-oss-120b
- https://build.nvidia.com/meta/llama-3_1-405b-instruct
- https://build.nvidia.com/moonshotai/kimi-k2-thinking
- https://build.nvidia.com/moonshotai/kimi-k2-instruct
Small but fast ones (use for easier tasks) -
- https://build.nvidia.com/openai/gpt-oss-20b
- https://build.nvidia.com/meta/llama-3_3-70b-instruct
- https://build.nvidia.com/meta/llama-3_1-70b-instruct
- https://build.nvidia.com/meta/llama-4-scout-17b-16e-instruct
- https://build.nvidia.com/qwen/qwen3-next-80b-a3b-thinking
- https://build.nvidia.com/qwen/qwen3-next-80b-a3b-instruct
Slow in practice -
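
All of the catalog pages above are, as far as I know, served behind NVIDIA's OpenAI-compatible endpoint (assumed to be https://integrate.api.nvidia.com/v1). A minimal sketch for turning one of these URLs into the `model` id that endpoint expects, assuming the id is simply the URL path with underscores in version numbers restored to dots (e.g. `llama-3_1` -> `llama-3.1`) — this mapping is an assumption inferred from the slug style, not documented fact:

```python
import re
from urllib.parse import urlparse

def model_id_from_url(url: str) -> str:
    """Extract an API model id from a build.nvidia.com catalog URL.

    Assumption: the id is the URL path, with '_' between digits
    restored to '.' (catalog slugs appear to swap them).
    """
    path = urlparse(url).path.strip("/")  # e.g. 'meta/llama-3_1-405b-instruct'
    return re.sub(r"(?<=\d)_(?=\d)", ".", path)

print(model_id_from_url("https://build.nvidia.com/meta/llama-3_1-405b-instruct"))
# meta/llama-3.1-405b-instruct
print(model_id_from_url("https://build.nvidia.com/openai/gpt-oss-120b"))
# openai/gpt-oss-120b
```

The resulting string would then go in the `model` field of a chat-completions request against the endpoint above, with a bearer token from build.nvidia.com.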