Grok 4.20 (Non-Reasoning)
x-ai/grok-4.20
Median Throughput: 243.3 tok/s
Throughput Runs: 9
TTFT Runs: 6
Avg TTFT: 1830 ms
Avg Throughput: 221.1 tok/s
Total Cost: $0.1248
Commentary
by openai/gpt-5.4-mini
Grok 4.20 (Non-Reasoning) is reliable on the BridgeBench speed benchmark, with a 100.0% success rate and no prompt-level failures, but startup latency is fairly high: time to first token (TTFT) averages 1830 ms across the six TTFT runs, with a median of 1999 ms. Sustained decode performance is strong overall at 221.1 tok/s average and 243.3 tok/s median across the nine throughput runs, and total cost stays low at $0.124758 for all 15 runs, though throughput drops materially on essay-style outputs.
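The headline figures can be cross-checked against the per-run values in the All Runs table. The following is a minimal sketch, assuming the average is a plain arithmetic mean and that only the six `ttft` runs feed the TTFT statistics; the function and variable names are illustrative and not part of BridgeBench.

```python
from statistics import mean, median

# Values copied from the All Runs table on this page.
THROUGHPUT_TOK_S = [285.7, 260.9, 247.6, 239.7, 243.3, 244.4, 155.7, 163.5, 148.9]
# Assumption: only the six "ttft"-type runs count toward the TTFT summary.
TTFT_MS = [1501, 2150, 1932, 2268, 1061, 2066]


def headline_stats(throughput, ttft):
    """Return the summary figures quoted in the commentary (hypothetical helper)."""
    return {
        "avg_tok_s": round(mean(throughput), 1),
        "median_tok_s": median(throughput),
        "avg_ttft_ms": round(mean(ttft)),
        # With an even number of runs, the median is the mean of the two middle values.
        "median_ttft_ms": median(ttft),
    }


if __name__ == "__main__":
    for name, value in headline_stats(THROUGHPUT_TOK_S, TTFT_MS).items():
        print(f"{name}: {value}")
```

Under those assumptions the sketch reproduces the reported 221.1 tok/s average, 243.3 tok/s median, 1830 ms average TTFT, and 1999 ms median TTFT.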
Api Design (throughput-api-design): This is the strongest throughput case, with a 260.9 tok/s median on ~2549 output tokens and no issues. The model sustains high decode speed on long, structured technical output without instability.
Data Structures (throughput-data-structures): Performance is solid and matches the overall median of 243.3 tok/s on ~2079 output tokens. This suggests stable sustained generation under moderate-length technical prompts.
Essay (throughput-essay): This is the main throughput weakness, falling to 155.7 tok/s median on ~1903 output tokens. The drop shows the model slows significantly on essay-style generation, possibly because the output is less structured long-form prose; the runs themselves do not isolate a cause.
Definition (ttft-definition): TTFT is slightly better here at 1932 ms median, but still near 2 seconds even for a short ~74-token response. Startup latency, rather than decode speed, remains the main bottleneck.
Factual (ttft-factual): This is the slowest startup case at 2066 ms median TTFT despite only ~15 output tokens. The short completion length makes the latency overhead especially visible and suggests weak first-token responsiveness.
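The per-prompt medians and approximate token counts quoted above can be derived by grouping the runs by prompt. This is a rough sketch, assuming the "~" token figures are means of the three runs rounded to the nearest integer; the `RUNS` tuple layout and `summarize` helper are hypothetical.

```python
from statistics import mean, median

# (prompt id, tok/s or None for ttft runs, TTFT in ms, output tokens),
# copied from the All Runs table.
RUNS = [
    ("throughput-api-design", 285.7, 15272, 2666),
    ("throughput-api-design", 260.9, 12835, 2584),
    ("throughput-api-design", 247.6, 12544, 2396),
    ("throughput-data-structures", 239.7, 12199, 2278),
    ("throughput-data-structures", 243.3, 14971, 1928),
    ("throughput-data-structures", 244.4, 14677, 2031),
    ("throughput-essay", 155.7, 20704, 1883),
    ("throughput-essay", 163.5, 28055, 1875),
    ("throughput-essay", 148.9, 17930, 1951),
    ("ttft-definition", None, 1501, 68),
    ("ttft-definition", None, 2150, 77),
    ("ttft-definition", None, 1932, 77),
    ("ttft-factual", None, 2268, 17),
    ("ttft-factual", None, 1061, 11),
    ("ttft-factual", None, 2066, 17),
]


def summarize(runs):
    """Group runs by prompt id and compute per-prompt summary figures."""
    groups = {}
    for prompt, tok_s, ttft_ms, tokens in runs:
        groups.setdefault(prompt, []).append((tok_s, ttft_ms, tokens))

    summary = {}
    for prompt, rows in groups.items():
        speeds = [r[0] for r in rows if r[0] is not None]
        summary[prompt] = {
            # Throughput is only reported for long-output runs.
            "median_tok_s": median(speeds) if speeds else None,
            "median_ttft_ms": median(r[1] for r in rows),
            # Assumption: "~N output tokens" is the rounded mean over runs.
            "mean_tokens": round(mean(r[2] for r in rows)),
        }
    return summary


if __name__ == "__main__":
    for prompt, stats in summarize(RUNS).items():
        print(prompt, stats)
```

With only three runs per prompt, each median is simply the middle run, so a single outlier run does not move it; that is a reasonable choice for a sample this small.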
Notable Prompts
Api Design: Highest sustained throughput at 260.9 tok/s median with no issues, indicating strong long-form decode capacity.
Essay: Throughput drops to 155.7 tok/s median, the largest degradation across prompt types.
Factual: 2066 ms median TTFT on a ~15-token output is a poor startup profile and dominates end-to-end latency.
Data Structures: Near-median throughput with no issues suggests good consistency on typical technical workloads.
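To illustrate why TTFT dominates short responses, end-to-end latency can be approximated as TTFT plus decode time. This is a back-of-the-envelope sketch, not a BridgeBench formula. It assumes short responses decode at the overall 243.3 tok/s median (decode speed is not reported for TTFT runs) and that a long API-design response would see the 1830 ms average TTFT; both are assumptions rather than measurements.

```python
def end_to_end_ms(ttft_ms, output_tokens, decode_tok_s):
    """Estimate total latency as TTFT plus time to decode the output tokens."""
    decode_ms = output_tokens / decode_tok_s * 1000.0
    return ttft_ms + decode_ms


def ttft_share(ttft_ms, output_tokens, decode_tok_s):
    """Fraction of the estimated end-to-end latency spent waiting for the first token."""
    return ttft_ms / end_to_end_ms(ttft_ms, output_tokens, decode_tok_s)


# Factual prompt: 2066 ms median TTFT, ~15 output tokens.
# ASSUMPTION: decode speed equals the overall 243.3 tok/s median.
short_share = ttft_share(2066, 15, 243.3)

# Api Design prompt: ~2549 output tokens at 260.9 tok/s median.
# ASSUMPTION: TTFT equals the 1830 ms average measured on the short TTFT prompts.
long_share = ttft_share(1830, 2549, 260.9)

if __name__ == "__main__":
    print(f"short factual answer: TTFT is {short_share:.0%} of estimated latency")
    print(f"long API design answer: TTFT is {long_share:.0%} of estimated latency")
```

Under these assumptions, TTFT accounts for roughly 97% of the estimated latency on the factual prompt but only about 16% on the long API design output, which is consistent with the commentary's point that startup latency matters most for short completions.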
All Runs
| Run | Prompt | ID | Type | Tok/s | TTFT (ms) | Tokens | Cost |
|---|---|---|---|---|---|---|---|
| 1 | Api Design | throughput-api-design | throughput | 285.7 | 15272 | 2666 | $0.0164 |
| 2 | Api Design | throughput-api-design | throughput | 260.9 | 12835 | 2584 | $0.0159 |
| 3 | Api Design | throughput-api-design | throughput | 247.6 | 12544 | 2396 | $0.0148 |
| 1 | Data Structures | throughput-data-structures | throughput | 239.7 | 12199 | 2278 | $0.0141 |
| 2 | Data Structures | throughput-data-structures | throughput | 243.3 | 14971 | 1928 | $0.0120 |
| 3 | Data Structures | throughput-data-structures | throughput | 244.4 | 14677 | 2031 | $0.0127 |
| 1 | Essay | throughput-essay | throughput | 155.7 | 20704 | 1883 | $0.0117 |
| 2 | Essay | throughput-essay | throughput | 163.5 | 28055 | 1875 | $0.0117 |
| 3 | Essay | throughput-essay | throughput | 148.9 | 17930 | 1951 | $0.0121 |
| 1 | Definition | ttft-definition | ttft | n/a | 1501 | 68 | $0.0007 |
| 2 | Definition | ttft-definition | ttft | n/a | 2150 | 77 | $0.0007 |
| 3 | Definition | ttft-definition | ttft | n/a | 1932 | 77 | $0.0007 |
| 1 | Factual | ttft-factual | ttft | n/a | 2268 | 17 | $0.0004 |
| 2 | Factual | ttft-factual | ttft | n/a | 1061 | 11 | $0.0003 |
| 3 | Factual | ttft-factual | ttft | n/a | 2066 | 17 | $0.0004 |
15 runs · Throughput rows require valid long-output runs · TTFT shown for all successful runs