Qwen 3.5 27B
Rank #1 · 27B · FP16
Summary
Pass Rate
73.1%
Tasks Passed
76/104
Model Size
27B
Quantization
FP16
Median Throughput
11.1 tok/s
Median TTFT
361 ms
Inference Success
86.7%
Avg Latency
182225 ms
Hardware Profile
Device
DGX Spark
Chip
GB10 Grace Blackwell
Memory
128 GB Unified
Backend
ollama
Quantization
FP16
Peak GPU Mem
0.0 GB
Category Results
Speed
11.1 tok/s · 361 ms TTFT
Hallucination
12/30 · 40.0%
Code Generation
15/20 · 75.0%
Reasoning
19/20 · 95.0%
Instruction Following
10/14 · 71.4%
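The headline figures in the Summary can be re-derived from the per-category counts. The sketch below is a cross-check, not the benchmark's own scoring code. It assumes that errored tasks are dropped from each category's denominator (which is what the Reasoning and Instruction Following counts suggest) and that Inference Success is the share of attempted tasks that returned a result. The variable names are illustrative.

```python
# Cross-check of the Summary figures (an illustrative sketch, not the
# benchmark's own scoring code).

# (passed, scored) per category. "Scored" excludes tasks marked Error.
# The Speed count comes from the Task Results section heading.
scored_counts = {
    "Speed": (20, 20),
    "Hallucination": (12, 30),
    "Code Generation": (15, 20),
    "Reasoning": (19, 20),
    "Instruction Following": (10, 14),
}

# Rows per category in the Task Results tables, including errored tasks.
attempted_counts = {
    "Speed": 20,
    "Hallucination": 30,
    "Code Generation": 20,
    "Reasoning": 30,
    "Instruction Following": 20,
}

passed = sum(p for p, _ in scored_counts.values())
scored = sum(s for _, s in scored_counts.values())
attempted = sum(attempted_counts.values())

# Assumption: pass rate is computed over scored tasks only.
pass_rate = round(100 * passed / scored, 1)
# Assumption: inference success is scored tasks over attempted tasks.
inference_success = round(100 * scored / attempted, 1)

print(f"Tasks passed: {passed}/{scored} ({pass_rate}%)")
print(f"Inference success: {scored}/{attempted} ({inference_success}%)")
```

Under these assumptions the script reproduces 76/104 (73.1%) and an inference success of 86.7%, matching the Summary.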
Task Results
Speed · 20/20 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| ttft-short-100 | standard | Pass | 9165ms | 100 |
| ttft-short-200 | standard | Pass | 9199ms | 100 |
| ttft-medium-500 | standard | Pass | 9212ms | 100 |
| ttft-medium-1k | standard | Pass | 9329ms | 100 |
| ttft-long-2k | standard | Pass | 9963ms | 100 |
| ttft-chat-context | standard | Pass | 9225ms | 100 |
| ttft-json-output | standard | Pass | 9242ms | 100 |
| ttft-multilang | standard | Pass | 9321ms | 100 |
| ttft-reasoning | standard | Pass | 9279ms | 100 |
| ttft-creative | standard | Pass | 9245ms | 100 |
| tp-essay | standard | Pass | 180398ms | 2000 |
| tp-code-app | standard | Pass | 180322ms | 2000 |
| tp-tutorial | standard | Pass | 135105ms | 1500 |
| tp-analysis | standard | Pass | 135165ms | 1500 |
| tp-debug | standard | Pass | 135474ms | 1500 |
| tp-architecture | standard | Pass | 180302ms | 2000 |
| tp-comparison | standard | Pass | 180300ms | 2000 |
| tp-security | standard | Pass | 180359ms | 2000 |
| tp-algorithm | standard | Pass | 180421ms | 2000 |
| tp-documentation | standard | Pass | 180376ms | 2000 |
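The median throughput can be sanity-checked against the tp-* rows above. This is a minimal sketch that assumes throughput is simply generated tokens divided by end-to-end latency for each throughput task. The dashboard may compute it differently, for example by excluding time to first token.

```python
from statistics import median

# (latency in ms, generated tokens) for the ten tp-* tasks in the Speed table.
tp_rows = [
    (180398, 2000),  # tp-essay
    (180322, 2000),  # tp-code-app
    (135105, 1500),  # tp-tutorial
    (135165, 1500),  # tp-analysis
    (135474, 1500),  # tp-debug
    (180302, 2000),  # tp-architecture
    (180300, 2000),  # tp-comparison
    (180359, 2000),  # tp-security
    (180421, 2000),  # tp-algorithm
    (180376, 2000),  # tp-documentation
]

# Assumption: throughput = tokens / total latency in seconds.
throughputs = [tokens / (latency_ms / 1000) for latency_ms, tokens in tp_rows]
median_tok_s = round(median(throughputs), 1)

print(f"Median throughput: {median_tok_s} tok/s")
```

Every row lands between roughly 11.07 and 11.11 tok/s, so the result agrees with the reported 11.1 tok/s whichever rows the median is taken over.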
Hallucination · 12/30 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| fact-01 | easy | Pass | 47453ms | 526 |
| fact-02 | medium | Pass | 37150ms | 411 |
| fact-03 | medium | Fail | 42435ms | 470 |
| fact-04 | hard | Pass | 40909ms | 453 |
| fact-05 | hard | Fail | 87840ms | 975 |
| fact-06 | easy | Pass | 49541ms | 549 |
| fact-07 | medium | Pass | 35822ms | 396 |
| fact-08 | hard | Pass | 37728ms | 417 |
| fact-09 | medium | Pass | 22969ms | 253 |
| fact-10 | hard | Pass | 105456ms | 1171 |
| code-01 | easy | Fail | 35094ms | 388 |
| code-02 | medium | Fail | 37409ms | 414 |
| code-03 | medium | Fail | 104976ms | 1165 |
| code-04 | hard | Fail | 50737ms | 561 |
| code-05 | hard | Fail | 70253ms | 779 |
| code-06 | easy | Fail | 60911ms | 675 |
| code-07 | medium | Pass | 64199ms | 712 |
| code-08 | hard | Pass | 91541ms | 1016 |
| code-09 | medium | Fail | 83442ms | 926 |
| code-10 | hard | Pass | 53601ms | 594 |
| cal-01 | medium | Fail | 42064ms | 466 |
| cal-02 | hard | Fail | 112964ms | 1254 |
| cal-03 | medium | Fail | 131965ms | 1465 |
| cal-04 | easy | Fail | 65535ms | 726 |
| cal-05 | hard | Fail | 190942ms | 2119 |
| cal-06 | medium | Fail | 63206ms | 701 |
| cal-07 | hard | Fail | 348529ms | 3858 |
| cal-08 | medium | Fail | 42718ms | 473 |
| cal-09 | easy | Pass | 31136ms | 344 |
| cal-10 | hard | Fail | 47216ms | 523 |
Code Generation · 15/20 passed
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| fn-01 | easy | Pass | 84327ms | 935 |
| fn-02 | easy | Pass | 313189ms | 3468 |
| fn-03 | medium | Fail | 87215ms | 967 |
| fn-04 | medium | Pass | 354018ms | 3917 |
| fn-05 | medium | Pass | 254742ms | 2822 |
| fn-06 | hard | Fail | 370301ms | 4096 |
| fn-07 | hard | Pass | 121735ms | 1350 |
| fn-08 | hard | Fail | 370428ms | 4096 |
| bug-01 | easy | Pass | 69583ms | 770 |
| bug-02 | medium | Pass | 245621ms | 2719 |
| bug-03 | hard | Pass | 370652ms | 4096 |
| bug-04 | medium | Pass | 134955ms | 1495 |
| algo-01 | medium | Pass | 319326ms | 3536 |
| algo-02 | hard | Pass | 237490ms | 2632 |
| algo-03 | medium | Pass | 157073ms | 1742 |
| algo-04 | hard | Pass | 129459ms | 1436 |
| multi-01 | hard | Fail | 370323ms | 4096 |
| multi-02 | hard | Fail | 370276ms | 4096 |
| multi-03 | hard | Pass | 204084ms | 2263 |
| multi-04 | hard | Pass | 333035ms | 3686 |
Reasoning · 19/20 passed (10 errored tasks not counted)
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| arith-01 | hard | Pass | 131219ms | 668 |
| arith-02 | hard | Pass | 262922ms | 460 |
| arith-03 | expert | Pass | 274552ms | 550 |
| arith-04 | expert | Pass | 192608ms | 1037 |
| arith-05 | expert | Pass | 217093ms | 289 |
| arith-06 | hard | Error | 300891ms | 0 |
| spatial-01 | hard | Pass | 242588ms | 505 |
| spatial-02 | expert | Error | 300911ms | 0 |
| spatial-03 | expert | Error | 300908ms | 0 |
| spatial-04 | hard | Pass | 197701ms | 655 |
| spatial-05 | expert | Error | 300936ms | 0 |
| spatial-06 | hard | Pass | 485585ms | 4096 |
| cstr-01 | hard | Pass | 333972ms | 682 |
| cstr-02 | expert | Error | 300914ms | 0 |
| cstr-03 | expert | Pass | 597644ms | 4096 |
| cstr-04 | hard | Error | 300907ms | 0 |
| cstr-05 | expert | Error | 300902ms | 0 |
| cstr-06 | hard | Error | 600001ms | 0 |
| adv-01 | hard | Error | 300878ms | 0 |
| adv-02 | expert | Pass | 546274ms | 4096 |
| adv-03 | expert | Pass | 213678ms | 332 |
| adv-04 | hard | Error | 300936ms | 0 |
| adv-05 | expert | Pass | 96946ms | 306 |
| adv-06 | expert | Pass | 161007ms | 841 |
| cf-01 | hard | Pass | 74787ms | 371 |
| cf-02 | expert | Fail | 118635ms | 607 |
| cf-03 | expert | Pass | 205478ms | 1098 |
| cf-04 | hard | Pass | 110954ms | 965 |
| cf-05 | expert | Pass | 210882ms | 970 |
| cf-06 | expert | Pass | 120444ms | 563 |
Instruction Following · 10/14 passed (6 errored tasks not counted)
| Task | Difficulty | Result | Latency | Tokens |
|---|---|---|---|---|
| fmt-01 | easy | Pass | 223005ms | 1090 |
| fmt-02 | easy | Pass | 240725ms | 844 |
| fmt-03 | medium | Pass | 232298ms | 374 |
| fmt-04 | medium | Pass | 227379ms | 761 |
| fmt-05 | hard | Fail | 518861ms | 4096 |
| fmt-06 | hard | Fail | 543421ms | 4096 |
| con-01 | easy | Error | 300905ms | 0 |
| con-02 | easy | Pass | 173987ms | 506 |
| con-03 | medium | Error | 300921ms | 0 |
| con-04 | medium | Error | 300916ms | 0 |
| con-05 | hard | Pass | 411547ms | 3019 |
| con-06 | hard | Pass | 589605ms | 4096 |
| role-01 | medium | Error | 300901ms | 0 |
| role-02 | medium | Pass | 392726ms | 1090 |
| role-03 | hard | Pass | 543323ms | 2811 |
| role-04 | hard | Error | 300896ms | 0 |
| mc-01 | hard | Fail | 511208ms | 4096 |
| mc-02 | hard | Error | 300916ms | 0 |
| mc-03 | hard | Pass | 253132ms | 2037 |
| mc-04 | hard | Fail | 400276ms | 4096 |