Model Analysis

Grok 4

x-ai/grok-4

61.4 median tok/s

3684ms TTFT

100.0% success

Throughput Runs

TTFT Runs

Avg TTFT: 3698ms

Avg Throughput: 57.4 tok/s

Total Cost: $0.0000

Commentary

by openai/gpt-5.4-mini

Grok 4 completed every BridgeBench speed run (100.0% success) at a reported cost of $0.0000, but its startup latency is high: across the six TTFT runs, median TTFT is 3684 ms and average TTFT is 3698 ms. Sustained decode throughput is 61.4 tok/s median and 57.4 tok/s average across the nine throughput runs, which range from 41.2 to 69.2 tok/s. The average sits below the median because the essay-style prompt runs markedly slower than the two technical prompts.

API Design · throughput

At 61.4 tok/s median with 2354 average output tokens, this prompt matches the model's overall median throughput. Its three runs range from 61.3 to 65.9 tok/s and no failures were reported, so sustained generation on long-form technical content is fairly stable.

Data Structures · throughput

This is the fastest throughput case at 65.7 tok/s, suggesting the model handles structured technical exposition efficiently. The 2316-token average output is similar to API Design, so the higher rate is likely workload-dependent rather than due to shorter generations.

Essay · throughput

This is the slowest throughput prompt at 44.6 tok/s, roughly 27% below API Design and 32% below Data Structures. Its 1924-token average output is only modestly shorter than the other throughput prompts, so output length does not explain the gap; this looks like a content-dependent slowdown on essay-style generation.
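The size of the essay gap can be checked with a little arithmetic. The sketch below uses only the per-prompt medians and average token counts quoted in this commentary; the helper names `relative_slowdown` and `decode_seconds` are illustrative, and the decode-time estimate assumes the median rate holds for the whole generation.

```python
# Median throughput (tok/s) and average output tokens per prompt,
# taken from the commentary above.
PROMPTS = {
    "API Design": (61.4, 2354),
    "Data Structures": (65.7, 2316),
    "Essay": (44.6, 1924),
}


def relative_slowdown(slow_tps: float, fast_tps: float) -> float:
    """Fraction by which slow_tps falls short of fast_tps."""
    return 1 - slow_tps / fast_tps


def decode_seconds(tok_per_s: float, tokens: float) -> float:
    """Approximate decode time, assuming a constant rate (an assumption)."""
    return tokens / tok_per_s


if __name__ == "__main__":
    essay_tps, essay_tokens = PROMPTS["Essay"]
    for name, (tps, tokens) in PROMPTS.items():
        if name == "Essay":
            continue
        gap = relative_slowdown(essay_tps, tps)
        print(f"Essay vs {name}: {gap:.1%} slower")
    print(f"Essay decode time: {decode_seconds(essay_tps, essay_tokens):.1f} s")
```

Under these assumptions the essay prompt spends longer decoding than API Design despite producing fewer tokens, which is consistent with the slowdown being rate-driven rather than length-driven.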

Definition · ttft

Median TTFT is 4134 ms, the slower of the two TTFT prompts. With only 67 output tokens on average, startup latency rather than decode length is the main cost.

Factual · ttft

This is the faster of the two TTFT prompts at 3286 ms median, but startup is still multi-second and not especially responsive. With only 20 output tokens on average, the latency profile is dominated by prefill/startup rather than generation.
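To make "dominated by prefill/startup" concrete, the following sketch estimates what share of total response time is TTFT. Throughput is not measured on the TTFT prompts, so the decode rate used here (the overall 61.4 tok/s median) is an assumption, and `ttft_share` is a hypothetical helper rather than part of the benchmark.

```python
def ttft_share(ttft_ms: float, output_tokens: float, decode_tok_per_s: float) -> float:
    """Fraction of total response time spent before the first token.

    Assumes decoding proceeds at a constant decode_tok_per_s after the
    first token; that rate is borrowed from the throughput prompts.
    """
    decode_ms = output_tokens / decode_tok_per_s * 1000
    return ttft_ms / (ttft_ms + decode_ms)


if __name__ == "__main__":
    ASSUMED_DECODE_TPS = 61.4  # overall median throughput, used as a stand-in
    for name, ttft_ms, tokens in [("Factual", 3286, 20), ("Definition", 4134, 67)]:
        share = ttft_share(ttft_ms, tokens, ASSUMED_DECODE_TPS)
        print(f"{name}: TTFT is about {share:.0%} of total response time")
```

With these inputs, startup accounts for the large majority of the response time on both short-answer prompts.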

Notable Prompts

Data Structures · throughput

Highest sustained throughput at 65.7 tok/s, indicating strong decode efficiency on structured technical content.

Essay · throughput

Lowest throughput at 44.6 tok/s, showing the model slows materially on essay-style long-form output.

Definition · ttft

Highest median TTFT at 4134 ms, so short-answer responsiveness is a clear weakness.

Factual · ttft

Lowest median TTFT at 3286 ms, making it the quicker of the two startup tests even though it remains slow in absolute terms.

All Runs

Run	Prompt	ID	Type	Tok/s	TTFT	Tokens	Cost
1	API Design	throughput-api-design	throughput	61.3	10959ms	2384	$0.0000
2	API Design	throughput-api-design	throughput	65.9	7525ms	2517	$0.0000
3	API Design	throughput-api-design	throughput	61.4	10622ms	2162	$0.0000
1	Data Structures	throughput-data-structures	throughput	69.2	9444ms	2269	$0.0000
2	Data Structures	throughput-data-structures	throughput	65.7	10234ms	2324	$0.0000
3	Data Structures	throughput-data-structures	throughput	62.0	9380ms	2356	$0.0000
1	Essay	throughput-essay	throughput	44.6	9237ms	1918	$0.0000
2	Essay	throughput-essay	throughput	45.3	7485ms	2050	$0.0000
3	Essay	throughput-essay	throughput	41.2	8723ms	1805	$0.0000
1	Definition	ttft-definition	ttft	n/a	4028ms	66	$0.0000
2	Definition	ttft-definition	ttft	n/a	4300ms	66	$0.0000
3	Definition	ttft-definition	ttft	n/a	4134ms	68	$0.0000
1	Factual	ttft-factual	ttft	n/a	3286ms	20	$0.0000
2	Factual	ttft-factual	ttft	n/a	3339ms	20	$0.0000
3	Factual	ttft-factual	ttft	n/a	3103ms	20	$0.0000

15 runs · Throughput rows require valid long-output runs · TTFT shown for all successful runs
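The headline figures can be reproduced from the run table. The sketch below is one way to do so in Python; it assumes the headline TTFT statistics are computed over the `ttft` runs only and the throughput statistics over the `throughput` runs only, an inference from the numbers rather than something the page states. The function names are illustrative.

```python
from statistics import mean, median

# (prompt, type, tok/s or None, TTFT in ms, output tokens), from the All Runs table.
RUNS = [
    ("API Design", "throughput", 61.3, 10959, 2384),
    ("API Design", "throughput", 65.9, 7525, 2517),
    ("API Design", "throughput", 61.4, 10622, 2162),
    ("Data Structures", "throughput", 69.2, 9444, 2269),
    ("Data Structures", "throughput", 65.7, 10234, 2324),
    ("Data Structures", "throughput", 62.0, 9380, 2356),
    ("Essay", "throughput", 44.6, 9237, 1918),
    ("Essay", "throughput", 45.3, 7485, 2050),
    ("Essay", "throughput", 41.2, 8723, 1805),
    ("Definition", "ttft", None, 4028, 66),
    ("Definition", "ttft", None, 4300, 66),
    ("Definition", "ttft", None, 4134, 68),
    ("Factual", "ttft", None, 3286, 20),
    ("Factual", "ttft", None, 3339, 20),
    ("Factual", "ttft", None, 3103, 20),
]


def headline_stats(runs):
    """Overall medians and means, split by run type (assumed grouping)."""
    tps = [r[2] for r in runs if r[1] == "throughput"]
    ttft = [r[3] for r in runs if r[1] == "ttft"]
    return {
        "median_tps": median(tps),
        "avg_tps": mean(tps),
        "median_ttft_ms": median(ttft),
        "avg_ttft_ms": mean(ttft),
    }


def per_prompt_median(runs):
    """Median tok/s for throughput prompts, median TTFT (ms) for ttft prompts."""
    groups = {}
    for prompt, kind, tps, ttft_ms, _tokens in runs:
        groups.setdefault(prompt, []).append(tps if kind == "throughput" else ttft_ms)
    return {prompt: median(values) for prompt, values in groups.items()}


if __name__ == "__main__":
    for key, value in headline_stats(RUNS).items():
        print(f"{key}: {value:.1f}")
    print(per_prompt_median(RUNS))
```

Under that grouping the results agree with the reported 61.4 and 57.4 tok/s and, after rounding to whole milliseconds, with the 3684 ms and 3698 ms TTFT figures. The throughput runs' own TTFT values (7485 ms to 10959 ms) do not enter the headline TTFT.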