Model Analysis

GLM-5

openrouter/z-ai/glm-5

73.3 median tok/s · 962ms TTFT · 100.0% success

Throughput Runs

TTFT Runs

Avg TTFT: 1089ms

Avg Throughput: 76.1 tok/s

Total Cost: $0.0728

Commentary

by openai/gpt-5.4-mini

GLM-5 completed all 15 BridgeBench runs (100.0% success rate) at a total cost of $0.072828. Sustained decode performance is solid at 73.3 tok/s median throughput (76.1 tok/s average) across the throughput runs. Startup latency is moderate at 962 ms median TTFT and 1089 ms average TTFT across the TTFT runs, with the factual workload starting more slowly than the definition workload.
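The headline figures can be reproduced from the per-run values in the All Runs table. The sketch below assumes the throughput statistics are taken over the nine throughput runs and the TTFT statistics over the six ttft-type runs only; that split is inferred from the numbers rather than stated on the page, and the variable names are illustrative.

```python
from statistics import mean, median

# Tok/s for the nine throughput runs, copied from the All Runs table.
throughput_tok_s = [82.1, 67.2, 45.5, 73.3, 77.0, 70.1, 57.6, 77.0, 135.4]

# TTFT in milliseconds for the six ttft-type runs (definition, then factual).
# Assumption: the summary TTFT excludes the TTFT values of throughput runs.
ttft_ms = [1037, 711, 648, 1501, 886, 1752]

median_tok_s = median(throughput_tok_s)
mean_tok_s = mean(throughput_tok_s)
median_ttft_ms = median(ttft_ms)
mean_ttft_ms = mean(ttft_ms)

print(f"throughput: median {median_tok_s:.1f} tok/s, mean {mean_tok_s:.1f} tok/s")
print(f"TTFT: median {median_ttft_ms:.1f} ms, mean {mean_ttft_ms:.1f} ms")
```

Under that assumption the computed values agree with the reported 73.3 tok/s, 76.1 tok/s, 962 ms, and 1089 ms after rounding.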

API Design (throughput)

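This is the weakest sustained-throughput case at 67.2 tok/s median, below the model's overall median of 73.3 tok/s. It also has the longest generations (2952 tokens on average) and the most complex instruction-following load, but individual runs ranged from 45.5 to 82.1 tok/s and the slowest run was the shortest of the three (2701 tokens), so with three samples the gap may reflect run-to-run variance as much as output length. All runs completed without failures.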

Data Structures (throughput)

Throughput matches the overall median at 73.3 tok/s, and the three runs fall in a narrow 70.1 to 77.0 tok/s band, indicating consistent decode performance on medium-length technical output. No issues were recorded, so this prompt looks representative of the model's baseline speed.

Essay (throughput)

This is the fastest sustained-throughput prompt at 77.0 tok/s median, and it has the shortest average output length among the throughput tasks. Individual runs ranged from 57.6 to 135.4 tok/s, so the result is consistent with the model holding its decode rate on shorter generations but does not establish that shorter outputs decode faster.
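The per-prompt throughput medians and average output lengths quoted above can be recomputed from the All Runs table, along with the spread of the individual runs. This is a minimal sketch with hypothetical variable names; the (tok/s, tokens) pairs are copied from the table.

```python
from statistics import median

# (tok/s, output tokens) for each throughput run, keyed by prompt ID.
runs = {
    "throughput-api-design": [(82.1, 3001), (67.2, 3155), (45.5, 2701)],
    "throughput-data-structures": [(73.3, 2589), (77.0, 2685), (70.1, 2848)],
    "throughput-essay": [(57.6, 2022), (77.0, 2026), (135.4, 1858)],
}

summary = {}
for prompt_id, samples in runs.items():
    rates = [rate for rate, _ in samples]
    tokens = [count for _, count in samples]
    summary[prompt_id] = {
        "median_tok_s": median(rates),
        "min_tok_s": min(rates),
        "max_tok_s": max(rates),
        "avg_tokens": sum(tokens) / len(tokens),
    }

for prompt_id, stats in summary.items():
    print(
        f"{prompt_id}: median {stats['median_tok_s']} tok/s "
        f"(range {stats['min_tok_s']} to {stats['max_tok_s']}), "
        f"avg {stats['avg_tokens']:.0f} tokens"
    )
```

Printing the range next to the median makes it easier to see that, with three runs per prompt, the spread within a prompt can exceed the differences between prompts.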

Definition (ttft)

Startup latency is strong here at 711 ms median TTFT, the best median among the TTFT workloads. Output averages only 54 tokens, but TTFT is measured at the first output token, so completion length cannot directly affect it; the fast start more plausibly reflects prompt-side factors such as a compact prompt and light prefill.

Factual (ttft)

This is the slowest startup case at 1501 ms median TTFT despite averaging only 12 output tokens, which points to prompt-side or serving-side overhead rather than decode cost. The 790 ms gap versus ttft-definition suggests TTFT may be sensitive to prompt content, not just output length, though with three runs per workload and overlapping ranges (648 to 1037 ms versus 886 to 1752 ms) the difference is not conclusive.
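The comparison between the two TTFT workloads can be checked directly against the All Runs table. The sketch below computes each workload's median, the gap between them, and whether the observed ranges overlap; the names are illustrative, and a range-overlap check is only a rough sanity test, not a statistical significance test.

```python
from statistics import median

# TTFT in milliseconds for the three runs of each ttft-type prompt.
ttft_ms = {
    "ttft-definition": [1037, 711, 648],
    "ttft-factual": [1501, 886, 1752],
}

medians = {prompt_id: median(values) for prompt_id, values in ttft_ms.items()}
gap_ms = medians["ttft-factual"] - medians["ttft-definition"]

# The ranges overlap if the fastest factual run beats the slowest definition run.
ranges_overlap = min(ttft_ms["ttft-factual"]) <= max(ttft_ms["ttft-definition"])

print(f"medians: {medians}")
print(f"median gap: {gap_ms} ms, ranges overlap: {ranges_overlap}")
```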

Notable Prompts

API Design (throughput)

Lowest median sustained throughput in the benchmark at 67.2 tok/s, 6.1 tok/s below the overall median, making it the slowest decode case among the throughput prompts.

Essay (throughput)

Highest median sustained throughput at 77.0 tok/s, indicating the model can keep decode speed above its overall median on shorter generations.

Definition (ttft)

Best median TTFT among the TTFT workloads at 711 ms, so the model can start responding quickly when the prompt is compact.

Factual (ttft)

Worst median TTFT among the TTFT workloads at 1501 ms, which is a notable startup penalty given the tiny output size.

All Runs

Run	Prompt	Prompt ID	Type	Tok/s	TTFT	Tokens	Cost
1	API Design	throughput-api-design	throughput	82.1	206ms	3001	$0.0097
2	API Design	throughput-api-design	throughput	67.2	2706ms	3155	$0.0102
3	API Design	throughput-api-design	throughput	45.5	532ms	2701	$0.0070
1	Data Structures	throughput-data-structures	throughput	73.3	1426ms	2589	$0.0084
2	Data Structures	throughput-data-structures	throughput	77.0	1408ms	2685	$0.0086
3	Data Structures	throughput-data-structures	throughput	70.1	7581ms	2848	$0.0092
1	Essay	throughput-essay	throughput	57.6	6991ms	2022	$0.0066
2	Essay	throughput-essay	throughput	77.0	838ms	2026	$0.0066
3	Essay	throughput-essay	throughput	135.4	571ms	1858	$0.0060
1	Definition	ttft-definition	ttft	n/a	1037ms	48	$0.0001
2	Definition	ttft-definition	ttft	n/a	711ms	55	$0.0002
3	Definition	ttft-definition	ttft	n/a	648ms	58	$0.0002
1	Factual	ttft-factual	ttft	n/a	1501ms	11	$0.0000
2	Factual	ttft-factual	ttft	n/a	886ms	12	$0.0001
3	Factual	ttft-factual	ttft	n/a	1752ms	12	$0.0001

15 runs · Throughput rows require valid long-output runs · TTFT shown for all successful runs
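The per-run costs in the table are rounded to four decimal places, so their sum need not equal the reported total of $0.072828 exactly. The sketch below checks that the two are consistent within the worst-case rounding error, assuming each displayed cost is rounded to the nearest $0.0001 (an assumption about the page's formatting, not something it states).

```python
from decimal import Decimal

# Per-run costs as displayed in the All Runs table, in table order.
displayed_costs = [
    Decimal(value)
    for value in [
        "0.0097", "0.0102", "0.0070",  # API Design
        "0.0084", "0.0086", "0.0092",  # Data Structures
        "0.0066", "0.0066", "0.0060",  # Essay
        "0.0001", "0.0002", "0.0002",  # Definition
        "0.0000", "0.0001", "0.0001",  # Factual
    ]
]

reported_total = Decimal("0.072828")  # total cost quoted in the commentary
displayed_sum = sum(displayed_costs)

# Each displayed value can be off by at most half of the last shown digit.
max_rounding_error = len(displayed_costs) * Decimal("0.00005")
difference = abs(displayed_sum - reported_total)
consistent = difference <= max_rounding_error

print(f"sum of displayed costs: ${displayed_sum}")
print(f"difference from reported total: ${difference}")
print(f"within rounding tolerance: {consistent}")
```

Decimal arithmetic is used here so that the comparison is not affected by binary floating-point error at this scale.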