Gemma 4 31B
Rank #4 · 31B · FP16
Summary
| Metric | Value |
|---|---|
| Pass Rate | 58.5% |
| Tasks Passed | 79/135 |
| Model Size | 31B |
| Quantization | FP16 |
| Median Throughput | 16.5 tok/s |
| Median TTFT | 10153 ms |
| Inference Success | 100.0% |
| Avg Latency | 59178 ms |
Hardware Profile
| Property | Value |
|---|---|
| Device | DGX Spark |
| Chip | GB10 Grace Blackwell |
| Memory | 128 GB Unified |
| Backend | ollama |
| Quantization | FP16 |
| Peak GPU Mem | 0.0 GB |
Category Results
| Category | Result |
|---|---|
| Speed | 16.5 tok/s · 10153 ms TTFT |
| Hallucination | 7/30 (23.3%) |
| Code Generation | 14/20 (70.0%) |
| Reasoning | 23/30 (76.7%) |
| Instruction Following | 10/20 (50.0%) |
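
The category percentages above appear to be plain passed/total ratios. The following is a minimal sketch of that arithmetic, assuming rounding to one decimal place; the `pass_rate` helper and the `CATEGORY_COUNTS` name are illustrative and not part of the benchmark harness.

```python
# Passed/total counts copied from the Category Results above.
# Speed is omitted because it is reported as throughput and TTFT, not a ratio.
CATEGORY_COUNTS = {
    "Hallucination": (7, 30),
    "Code Generation": (14, 20),
    "Reasoning": (23, 30),
    "Instruction Following": (10, 20),
}


def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage, rounded to one decimal place.

    The one-decimal rounding is an assumption made to match the displayed figures.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 1)


for name, (passed, total) in CATEGORY_COUNTS.items():
    print(f"{name}: {passed}/{total} = {pass_rate(passed, total):.1f}%")
```

The same helper can be pointed at any per-task table on this page by counting its `Pass` rows and dividing by the number of rows.
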
Task Results
Speed: 20/20 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| ttft-short-100 | standard | Pass | 4586ms | 44 |
| ttft-short-200 | standard | Pass | 10151ms | 100 |
| ttft-medium-500 | standard | Pass | 10037ms | 100 |
| ttft-medium-1k | standard | Pass | 10215ms | 100 |
| ttft-long-2k | standard | Pass | 10881ms | 100 |
| ttft-chat-context | standard | Pass | 10155ms | 100 |
| ttft-json-output | standard | Pass | 10090ms | 100 |
| ttft-multilang | standard | Pass | 10161ms | 100 |
| ttft-reasoning | standard | Pass | 10162ms | 100 |
| ttft-creative | standard | Pass | 10118ms | 100 |
| tp-essay | standard | Pass | 203263ms | 2000 |
| tp-code-app | standard | Pass | 203425ms | 2000 |
| tp-tutorial | standard | Pass | 151991ms | 1500 |
| tp-analysis | standard | Pass | 152035ms | 1500 |
| tp-debug | standard | Pass | 152761ms | 1500 |
| tp-architecture | standard | Pass | 203582ms | 2000 |
| tp-comparison | standard | Pass | 203467ms | 2000 |
| tp-security | standard | Pass | 203725ms | 2000 |
| tp-algorithm | standard | Pass | 203511ms | 2000 |
| tp-documentation | standard | Pass | 203473ms | 2000 |
Hallucination: 7/30 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| fact-01 | easy | Pass | 6862ms | 67 |
| fact-02 | medium | Pass | 8972ms | 88 |
| fact-03 | medium | Fail | 50184ms | 500 |
| fact-04 | hard | Pass | 50167ms | 500 |
| fact-05 | hard | Fail | 50211ms | 500 |
| fact-06 | easy | Fail | 50234ms | 500 |
| fact-07 | medium | Pass | 11140ms | 110 |
| fact-08 | hard | Pass | 47902ms | 477 |
| fact-09 | medium | Pass | 16663ms | 166 |
| fact-10 | hard | Fail | 50212ms | 500 |
| code-01 | easy | Fail | 50251ms | 500 |
| code-02 | medium | Fail | 45583ms | 454 |
| code-03 | medium | Fail | 50256ms | 500 |
| code-04 | hard | Fail | 50192ms | 500 |
| code-05 | hard | Fail | 50217ms | 500 |
| code-06 | easy | Fail | 50226ms | 500 |
| code-07 | medium | Fail | 50200ms | 500 |
| code-08 | hard | Fail | 50201ms | 500 |
| code-09 | medium | Fail | 26116ms | 261 |
| code-10 | hard | Fail | 50213ms | 500 |
| cal-01 | medium | Fail | 50198ms | 500 |
| cal-02 | hard | Fail | 50213ms | 500 |
| cal-03 | medium | Fail | 50211ms | 500 |
| cal-04 | easy | Fail | 50200ms | 500 |
| cal-05 | hard | Fail | 50176ms | 500 |
| cal-06 | medium | Fail | 50200ms | 500 |
| cal-07 | hard | Fail | 50234ms | 500 |
| cal-08 | medium | Fail | 50228ms | 500 |
| cal-09 | easy | Pass | 50194ms | 500 |
| cal-10 | hard | Fail | 50305ms | 500 |
Code Generation: 14/20 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| fn-01 | easy | Pass | 58027ms | 258 |
| fn-02 | easy | Pass | 74899ms | 745 |
| fn-03 | medium | Fail | 27655ms | 275 |
| fn-04 | medium | Pass | 62162ms | 618 |
| fn-05 | medium | Pass | 107491ms | 1065 |
| fn-06 | hard | Fail | 89554ms | 888 |
| fn-07 | hard | Pass | 65946ms | 656 |
| fn-08 | hard | Fail | 39508ms | 392 |
| bug-01 | easy | Pass | 26279ms | 260 |
| bug-02 | medium | Pass | 45680ms | 451 |
| bug-03 | hard | Fail | 152636ms | 1500 |
| bug-04 | medium | Pass | 41086ms | 406 |
| algo-01 | medium | Fail | 152090ms | 1500 |
| algo-02 | hard | Pass | 70004ms | 697 |
| algo-03 | medium | Pass | 46289ms | 460 |
| algo-04 | hard | Pass | 44480ms | 442 |
| multi-01 | hard | Pass | 37958ms | 377 |
| multi-02 | hard | Fail | 69798ms | 694 |
| multi-03 | hard | Pass | 80622ms | 801 |
| multi-04 | hard | Pass | 74719ms | 742 |
Reasoning: 23/30 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| arith-01 | hard | Pass | 63487ms | 632 |
| arith-02 | hard | Pass | 13117ms | 130 |
| arith-03 | expert | Pass | 44212ms | 439 |
| arith-04 | expert | Pass | 100806ms | 1000 |
| arith-05 | expert | Pass | 33369ms | 332 |
| arith-06 | hard | Pass | 33885ms | 337 |
| spatial-01 | hard | Pass | 61353ms | 610 |
| spatial-02 | expert | Pass | 36759ms | 365 |
| spatial-03 | expert | Pass | 80536ms | 800 |
| spatial-04 | hard | Pass | 99916ms | 991 |
| spatial-05 | expert | Pass | 37203ms | 369 |
| spatial-06 | hard | Fail | 100986ms | 1000 |
| cstr-01 | hard | Pass | 32482ms | 323 |
| cstr-02 | expert | Fail | 100937ms | 1000 |
| cstr-03 | expert | Fail | 101020ms | 1000 |
| cstr-04 | hard | Fail | 100919ms | 1000 |
| cstr-05 | expert | Fail | 100930ms | 1000 |
| cstr-06 | hard | Pass | 83808ms | 831 |
| adv-01 | hard | Pass | 16254ms | 162 |
| adv-02 | expert | Fail | 11731ms | 116 |
| adv-03 | expert | Pass | 31597ms | 314 |
| adv-04 | hard | Pass | 35832ms | 357 |
| adv-05 | expert | Pass | 40879ms | 407 |
| adv-06 | expert | Pass | 46359ms | 461 |
| cf-01 | hard | Pass | 35836ms | 356 |
| cf-02 | expert | Fail | 82408ms | 818 |
| cf-03 | expert | Pass | 71068ms | 707 |
| cf-04 | hard | Pass | 55517ms | 552 |
| cf-05 | expert | Pass | 85488ms | 849 |
| cf-06 | expert | Pass | 75722ms | 752 |
Instruction Following: 10/20 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| fmt-01 | easy | Pass | 12686ms | 126 |
| fmt-02 | easy | Pass | 37302ms | 372 |
| fmt-03 | medium | Pass | 21382ms | 214 |
| fmt-04 | medium | Pass | 25999ms | 260 |
| fmt-05 | hard | Fail | 27408ms | 273 |
| fmt-06 | hard | Fail | 50227ms | 500 |
| con-01 | easy | Pass | 49104ms | 489 |
| con-02 | easy | Pass | 9178ms | 89 |
| con-03 | medium | Pass | 41807ms | 417 |
| con-04 | medium | Fail | 50186ms | 500 |
| con-05 | hard | Fail | 50228ms | 500 |
| con-06 | hard | Fail | 50295ms | 500 |
| role-01 | medium | Pass | 77165ms | 449 |
| role-02 | medium | Fail | 50254ms | 500 |
| role-03 | hard | Pass | 50288ms | 500 |
| role-04 | hard | Fail | 50253ms | 500 |
| mc-01 | hard | Fail | 50205ms | 500 |
| mc-02 | hard | Fail | 30593ms | 305 |
| mc-03 | hard | Pass | 34161ms | 340 |
| mc-04 | hard | Fail | 50356ms | 500 |