Leaderboard Overview
See how leading AI coding models stack up across algorithms, debugging, refactoring, generation, UI, security, and speed. Each card provides a snapshot of the top performers in that category.
Security
Mar 27
| Rank | Model | Score |
|---|---|---|
| 1 | Claude Sonnet 4.6 | 85.3 |
| 2 | Gemini 3.1 Pro | 85.2 |
| 3 | GPT-5.4 | 84.8 |
| 4 | GPT-5.4 Mini | 83.2 |
| 5 | GPT-5.4 Nano | 81.9 |
| 6 | Claude Opus 4.6 | 81.6 |
| 7 | Grok 4.20 Reasoning | 78.9 |
| 8 | Grok 4.20 (Non-Reasoning) | 76.3 |
| 9 | MiMo-V2-Pro | 53.2 |
Speed
Mar 27
| Rank | Model | tok/s | TTFT |
|---|---|---|---|
| 1 | Grok 4.20 (Non-Reasoning) | 243.3 | 1999ms |
| 2 | Grok 4.20 Reasoning | 237.7 | 1497ms |
| 3 | Gemini 3.1 Pro | 122.2 | 7608ms |
| 4 | Claude Sonnet 4.6 | 95.3 | 1207ms |
| 5 | Claude Opus 4.6 | 92.2 | 1922ms |
| 6 | GPT-5.4 | 88 | 397ms |
| 7 | GLM-5 | 73.3 | 962ms |
| 8 | MiniMax M2.7 | 68.2 | 6150ms |
| 9 | Grok 4 | 61.4 | 3684ms |
| 10 | MiMo-V2-Pro | 57.5 | 7791ms |
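The two Speed columns pull in different directions: a model can stream quickly yet take a long time to start, or the reverse. As a rough sketch of how they might combine, the snippet below estimates end-to-end response time as TTFT plus output length divided by throughput. It assumes tok/s is steady output throughput measured after the first token, which the leaderboard does not state; the function names and the chosen output lengths are illustrative only.

```python
# Figures copied from the Speed table: (tokens per second, TTFT in ms).
SPEED = {
    "Grok 4.20 (Non-Reasoning)": (243.3, 1999),
    "Gemini 3.1 Pro": (122.2, 7608),
    "Claude Sonnet 4.6": (95.3, 1207),
    "GPT-5.4": (88.0, 397),
}


def estimated_seconds(tok_per_s: float, ttft_ms: float, output_tokens: int) -> float:
    """Rough end-to-end time: wait for the first token, then stream the rest.

    Assumption (not from the leaderboard): tok/s is a steady streaming rate
    that applies only after the first token arrives.
    """
    return ttft_ms / 1000.0 + output_tokens / tok_per_s


def rank_by_latency(output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, estimated seconds) pairs, fastest first."""
    estimates = [
        (model, estimated_seconds(tps, ttft, output_tokens))
        for model, (tps, ttft) in SPEED.items()
    ]
    return sorted(estimates, key=lambda pair: pair[1])


if __name__ == "__main__":
    for n in (100, 2000):
        print(f"{n} output tokens:")
        for model, seconds in rank_by_latency(n):
            print(f"  {model}: {seconds:.2f}s")
```

Under this simplified model, a 100-token reply favors GPT-5.4 because of its low TTFT, while a 2,000-token reply favors Grok 4.20 (Non-Reasoning) because throughput dominates. Real latency also depends on prompt length, load, and reasoning overhead, none of which this estimate captures.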
Coming Soon
Overall
| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 | 95.5 |
| 2 | GPT-5.4 Mini | 94.8 |
| 3 | GPT-5.4 Nano | 92.9 |
| 4 | GPT-4.1 | 91.8 |
| 5 | Qwen 3.5 35B-A3B | 91.7 |
| 6 | Claude Sonnet 4.5 | 90.7 |
| 7 | Qwen 3.5 122B-A10B | 90.0 |
| 8 | o3-mini | 89.6 |
| 9 | Qwen 3.5 27B | 89.5 |
| 10 | Gemini 2.5 Pro | 88.9 |
Coming Soon
Algorithms
| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 Mini | 99.0 |
| 2 | GPT-5.4 | 98.9 |
| 3 | GPT-5.4 Nano | 97.8 |
| 4 | Qwen 3.5 122B-A10B | 94.9 |
| 5 | Qwen 3.5 35B-A3B | 94.7 |
| 6 | Qwen 3.5 27B | 94.5 |
| 7 | GPT-4.1 | 92.7 |
| 8 | o3-mini | 90.3 |
| 9 | Gemini 2.5 Pro | 89.8 |
| 10 | Claude Sonnet 4.5 | 89.6 |
Coming Soon
Debugging
| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 | 96.4 |
| 2 | GPT-5.4 Mini | 96.4 |
| 3 | GPT-5.4 Nano | 96.0 |
| 4 | Qwen 3.5 35B-A3B | 96.0 |
| 5 | Qwen 3.5 122B-A10B | 94.1 |
| 6 | GPT-4.1 | 93.8 |
| 7 | Qwen 3.5 27B | 93.2 |
| 8 | Claude Sonnet 4.5 | 92.5 |
| 9 | o3-mini | 91.4 |
| 10 | Gemini 2.5 Pro | 90.6 |
Coming Soon
Refactoring
| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 Nano | 98.3 |
| 2 | GPT-5.4 | 97.9 |
| 3 | GPT-5.4 Mini | 97.6 |
| 4 | Claude Sonnet 4.5 | 93.1 |
| 5 | GPT-4.1 | 91.9 |
| 6 | o3-mini | 89.8 |
| 7 | Gemini 2.5 Pro | 88.4 |
| 8 | Qwen 3.5 122B-A10B | 87.4 |
| 9 | Qwen 3.5 35B-A3B | 87.3 |
| 10 | Qwen 3.5 Flash (02-23) | 86.5 |
Coming Soon
Generation
| Rank | Model | Score |
|---|---|---|
| 1 | GPT-5.4 | 97.0 |
| 2 | GPT-5.4 Mini | 94.4 |
| 3 | Qwen 3.5 35B-A3B | 93.5 |
| 4 | Qwen 3.5 122B-A10B | 92.5 |
| 5 | GPT-4.1 | 92.4 |
| 6 | Qwen 3.5 27B | 92.2 |
| 7 | Qwen 3.5 Flash (02-23) | 90.8 |
| 8 | Claude Sonnet 4.5 | 90.4 |
| 9 | GPT-5.4 Nano | 90.1 |
| 10 | Gemini 2.5 Pro | 89.3 |
Coming Soon
UI
| Rank | Model | Score |
|---|---|---|
| 1 | Claude Sonnet 4.5 | 90.9 |
| 2 | GPT-5.4 | 89.7 |
| 3 | Gemini 2.5 Pro | 89.0 |
| 4 | GPT-4.1 | 88.9 |
| 5 | GPT-5.4 Mini | 88.4 |
| 6 | Grok 4 | 88.1 |
| 7 | Qwen 3.5 27B | 86.9 |
| 8 | Qwen 3.5 122B-A10B | 86.7 |
| 9 | o3-mini | 86.5 |
| 10 | Qwen 3.5 35B-A3B | 86.0 |