Speed
How fast can each AI coding model generate code? Throughput and TTFT measured separately for accurate, real-world numbers that matter for vibe coding and agentic coding workflows.
Updated 2026-03-27
Throughput
Sustained tok/s measured first-to-last token on 1,500+ token outputs.
TTFT
Time to first token on short prompts. How fast the model starts responding.
Methodology
Median across 3+ runs per prompt. Short and long outputs measured separately.
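The two metrics above can be computed from per-token timestamps recorded during streaming. A minimal sketch (function names and the timestamp-based approach are illustrative, not the benchmark's actual harness): throughput is measured first-to-last token so TTFT is excluded, and the final reported number is the median across runs.

```python
import statistics


def ttft_ms(request_start: float, first_token_time: float) -> float:
    """Time to first token, in milliseconds, from request send to first streamed token."""
    return (first_token_time - request_start) * 1000.0


def throughput_tps(first_token_time: float, last_token_time: float, n_tokens: int) -> float:
    """Sustained tokens/sec over the first-to-last-token window (TTFT excluded).

    n_tokens - 1 inter-token gaps span the window between the first and last token.
    """
    return (n_tokens - 1) / (last_token_time - first_token_time)


def median_metric(runs: list[float]) -> float:
    """Median across repeated runs, matching the 'median across 3+ runs' methodology."""
    return statistics.median(runs)
```

For example, a 1,501-token output whose first token arrives at t=0s and last at t=10s yields a sustained 150 tok/s, independent of how long the model took to start responding.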
| Rank | Model | Throughput | TTFT | Cost |
|---|---|---|---|---|
| 1 | Grok 4.20 (Non-Reasoning) x-ai/grok-4.20 | 243.3t/s | 1999ms | $0.12 |
| 2 | Grok 4.20 Reasoning x-ai/grok-4.20-reasoning | 237.7t/s | 1497ms | $0.12 |
| 3 | Gemini 3.1 Pro google/gemini-3.1-pro-preview | 122.2t/s | 7608ms | $0.11 |
| 4 | Claude Sonnet 4.6 anthropic/claude-sonnet-4-6 | 95.3t/s | 1207ms | $0.50 |
| 5 | Claude Opus 4.6 anthropic/claude-opus-4-6 | 92.2t/s | 1922ms | $2.80 |
| 6 | GPT-5.4 openai/gpt-5.4 | 88t/s | 397ms | $0.30 |
| 7 | GLM-5 openrouter/z-ai/glm-5 | 73.3t/s | 962ms | $0.07 |
| 8 | MiniMax M2.7 minimax/MiniMax-M2.7 | 68.2t/s | 6150ms | $0.07 |
| 9 | Grok 4 x-ai/grok-4 | 61.4t/s | 3684ms | $0.00 |
| 10 | MiMo-V2-Pro openrouter/xiaomi/mimo-v2-pro | 57.5t/s | 7791ms | $0.11 |
| 11 | GLM 5.1 z-ai/glm-5.1 | 44.3t/s | 2353ms | $0.08 |
11 models tested · Ranked by median throughput · More models coming soon