BridgeBench
Speed
Model Analysis

GLM-5

openrouter/z-ai/glm-5

73.3 median tok/s
962ms TTFT · 100.0% success

Throughput Runs: 9
TTFT Runs: 6
Avg TTFT: 1089ms
Avg Throughput: 76.1 tok/s
Total Cost: $0.0728

Commentary

by openai/gpt-5.4-mini

GLM-5 is reliable on BridgeBench with a 100.0% success rate and a low total cost of $0.072828. Sustained decode performance is solid at 73.3 tok/s median throughput (76.1 tok/s average), while startup latency is moderate at 962 ms median TTFT and 1089 ms average TTFT, with the factual TTFT workload starting noticeably slower than the definition workload.
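The headline aggregates can be re-derived from the per-run numbers in the All Runs table. A minimal sketch using Python's statistics module, assuming the per-run values transcribed below are read correctly from the table:

```python
from statistics import mean, median

# Per-run values transcribed from the All Runs table:
# tok/s for the nine throughput runs, TTFT (ms) for the six ttft runs.
throughput_tok_s = [82.1, 67.2, 45.5,   # api-design
                    73.3, 77.0, 70.1,   # data-structures
                    57.6, 77.0, 135.4]  # essay
ttft_ms = [1037, 711, 648,              # definition
           1501, 886, 1752]             # factual

print(f"median throughput: {median(throughput_tok_s):.1f} tok/s")  # 73.3
print(f"avg throughput:    {mean(throughput_tok_s):.1f} tok/s")    # 76.1
print(f"median TTFT:       {median(ttft_ms):.0f} ms")              # 962
print(f"avg TTFT:          {mean(ttft_ms):.0f} ms")                # 1089
```

Note that with six TTFT runs the median is the midpoint of the two central values (886 and 1037), which is where the 962 ms figure comes from.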

Api Design (throughput)

This is the weakest sustained-throughput case at 67.2 tok/s median, likely reflecting the longer 2952-token generations and more complex instruction-following load. It is still stable with no failures, but decode speed drops below the model's overall median.

Data Structures (throughput)

Throughput is near the overall median at 73.3 tok/s, indicating consistent decode performance on medium-length technical output. No issues were recorded, so this prompt looks representative of the model's baseline speed.

Essay (throughput)

This is the fastest sustained-throughput prompt at 77.0 tok/s median, helped by the shortest average output length among throughput tasks. The result suggests the model maintains or slightly improves decode rate when generation length is lower.

Definition (ttft)

Startup latency is strong here at 711 ms median TTFT, the best TTFT result in the set. The very short 54-token output likely reduces prefill and generation overhead, making this the model's fastest response-start case.

Factual (ttft)

This is the slowest startup case at 1501 ms median TTFT despite only 12 average output tokens, which points to prompt-side prefill or retrieval-like overhead rather than decode cost. The gap versus ttft-definition is large enough to suggest TTFT is sensitive to prompt content, not just output length.
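The TTFT/throughput split above follows from how these metrics are typically measured against a streaming endpoint: TTFT is wall time from request send to the first streamed token, and sustained throughput is tokens per second after the first token lands. A minimal sketch, assuming a hypothetical stream_tokens generator that yields tokens as they arrive (not part of BridgeBench or the OpenRouter API):

```python
import time

def measure_run(stream_tokens, prompt):
    """Time one streamed generation.

    stream_tokens: hypothetical generator yielding tokens as they arrive.
    Returns (ttft_ms, decode_tok_s): time-to-first-token in ms and the
    sustained decode rate measured after the first token.
    """
    start = time.monotonic()
    first_at = None
    n_tokens = 0
    for _ in stream_tokens(prompt):
        n_tokens += 1
        if first_at is None:
            first_at = time.monotonic()
    end = time.monotonic()
    ttft_ms = (first_at - start) * 1000.0
    decode_time = end - first_at
    # Decode rate excludes the first token, whose latency is dominated
    # by prompt prefill rather than decode speed.
    decode_tok_s = (n_tokens - 1) / decode_time if decode_time > 0 else 0.0
    return ttft_ms, decode_tok_s

# Usage with a simulated stream: ~80 ms of "prefill", then 20 tokens
# at roughly 10 ms each.
def fake_stream(prompt):
    time.sleep(0.08)
    for _ in range(20):
        time.sleep(0.01)
        yield "tok"

ttft, tps = measure_run(fake_stream, "What is a bridge?")
print(f"TTFT {ttft:.0f} ms, {tps:.0f} tok/s")
```

This separation is why a prompt like ttft-factual can show a large TTFT penalty even with a tiny 12-token output: prefill cost is paid before the first token, independent of how fast decode runs afterward.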

Notable Prompts

Api Design (throughput)

Lowest sustained throughput in the benchmark at 67.2 tok/s, making it the clearest decode-speed regression point.

Essay (throughput)

Highest sustained throughput at 77.0 tok/s, indicating the model can keep decode speed above its median on shorter generations.

Definition (ttft)

Best TTFT at 711 ms, so the model can start responding quickly when the prompt is compact.

Factual (ttft)

Worst TTFT at 1501 ms, which is a notable startup penalty given the tiny output size.

All Runs

Run  Prompt           ID                          Type        Tok/s   TTFT
1    Api Design       throughput-api-design       throughput   82.1    206ms
2    Api Design       throughput-api-design       throughput   67.2   2706ms
3    Api Design       throughput-api-design       throughput   45.5    532ms
1    Data Structures  throughput-data-structures  throughput   73.3   1426ms
2    Data Structures  throughput-data-structures  throughput   77.0   1408ms
3    Data Structures  throughput-data-structures  throughput   70.1   7581ms
1    Essay            throughput-essay            throughput   57.6   6991ms
2    Essay            throughput-essay            throughput   77.0    838ms
3    Essay            throughput-essay            throughput  135.4    571ms
1    Definition       ttft-definition             ttft          n/a   1037ms
2    Definition       ttft-definition             ttft          n/a    711ms
3    Definition       ttft-definition             ttft          n/a    648ms
1    Factual          ttft-factual                ttft          n/a   1501ms
2    Factual          ttft-factual                ttft          n/a    886ms
3    Factual          ttft-factual                ttft          n/a   1752ms

15 runs · Throughput rows require valid long-output runs · TTFT shown for all successful runs