Speed
How fast can each AI coding model generate code? Throughput and TTFT measured separately for accurate, real-world numbers that matter for vibe coding and agentic coding workflows.
Updated 2026-03-27
Throughput
Sustained tok/s measured first-to-last token on 1,500+ token outputs.
TTFT
Time to first token on short prompts. How fast the model starts responding.
Methodology
Median across 3+ runs per prompt. Short and long outputs measured separately.
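The two metrics above can be computed from per-token timestamps recorded during streaming. A minimal sketch (function names and the timestamp-based approach are illustrative, not the benchmark's actual harness): throughput is measured first-to-last token so TTFT is excluded, and the final reported number is the median across runs.

```python
import statistics


def ttft_ms(request_start: float, first_token_time: float) -> float:
    """Time to first token, in milliseconds, from request send to first streamed token."""
    return (first_token_time - request_start) * 1000.0


def throughput_tps(first_token_time: float, last_token_time: float, n_tokens: int) -> float:
    """Sustained tokens/sec over the first-to-last-token window (TTFT excluded).

    n_tokens - 1 inter-token gaps span the window between the first and last token.
    """
    return (n_tokens - 1) / (last_token_time - first_token_time)


def median_metric(runs: list[float]) -> float:
    """Median across repeated runs, matching the 'median across 3+ runs' methodology."""
    return statistics.median(runs)
```

For example, a 1,501-token output whose first token arrives at t=0s and last at t=10s yields a sustained 150 tok/s, independent of how long the model took to start responding.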
| Rank | Model | Throughput | TTFT | Cost |
|---|---|---|---|---|
| 1 | Grok 4.20 (Non-Reasoning) x-ai/grok-4.20 | 243.3t/s | 1999ms | $0.12 |
| 2 | Grok 4.20 Reasoning x-ai/grok-4.20-reasoning | 237.7t/s | 1497ms | $0.12 |
| 3 | Gemini 3.1 Pro google/gemini-3.1-pro-preview | 122.2t/s | 7608ms | $0.11 |
| 4 | Claude Sonnet 4.6 anthropic/claude-sonnet-4-6 | 95.3t/s | 1207ms | $0.50 |
| 5 | Claude Opus 4.6 anthropic/claude-opus-4-6 | 92.2t/s | 1922ms | $2.80 |
| 6 | GPT-5.4 openai/gpt-5.4 | 88t/s | 397ms | $0.30 |
| 7 | GLM-5 openrouter/z-ai/glm-5 | 73.3t/s | 962ms | $0.07 |
| 8 | MiniMax M2.7 minimax/MiniMax-M2.7 | 68.2t/s | 6150ms | $0.07 |
| 9 | Grok 4 x-ai/grok-4 | 61.4t/s | 3684ms | $0.00 |
| 10 | MiMo-V2-Pro openrouter/xiaomi/mimo-v2-pro | 57.5t/s | 7791ms | $0.11 |
| 11 | GLM 5.1 z-ai/glm-5.1 | 44.3t/s | 2353ms | $0.08 |
11 models tested · Ranked by median throughput · More models coming soon