Leaderboard
On-device LLM performance rankings powered by Glicko-2
iPhone 12 Pro Max (iOS)
Rank: #36
Rating: 1,863 (±21 RD)
Win Rate: 85.1%
Conservative Rating: 1,821
TG Rating: 1,804
PP Rating: 1,894
Matches: 597
Record: 508W – 89L
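The summary figures above are consistent with two simple calculations, sketched here. The win rate follows directly from the record. The conservative-rating formula (rating minus two rating deviations) is an assumption: it reproduces the displayed value, but the leaderboard does not state how it computes it. The function names are illustrative.

```python
def win_rate(wins: int, losses: int) -> float:
    """Win rate as a percentage of all matches played."""
    return 100.0 * wins / (wins + losses)


def conservative_rating(rating: float, rd: float, k: float = 2.0) -> float:
    """Lower bound on the rating: rating minus k rating deviations.

    k = 2 is an assumption that happens to match the displayed value;
    the leaderboard's real formula is not documented here.
    """
    return rating - k * rd


wins, losses = 508, 89
print(wins + losses)                      # 597 matches
print(round(win_rate(wins, losses), 1))   # 85.1
print(conservative_rating(1863, 21))      # 1821.0
```

Subtracting a multiple of RD penalizes devices with few matches, whose ratings are less certain, so a ranking by conservative rating favors well-tested devices.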
Models Tested
| Model | TG Median (tok/s) | PP Median (tok/s) | TG Best (tok/s) | PP Best (tok/s) | Runs |
|---|---|---|---|---|---|
| DeepSeek-R1-Distill-Qwen-1.5B-uncensored.f16 | 64.87 | 1341.00 | 64.87 | 1341.00 | 1 |
| granite-3.1-3b-a800m-instruct-Q4_K_M | 27.50 | 9.89 | 27.50 | 9.89 | 1 |
| gemma-3-1b-it.Q4_K_S | 25.40 | 38.41 | 25.40 | 38.41 | 1 |
| chatgpt-5-q8_0 | 24.80 | 480.65 | 42.32 | 657.80 | 2 |
| Qwen3-1.7B-Q4_K_M | 22.51 | 180.52 | 22.51 | 180.52 | 1 |
| llama-3.2-1b-instruct-q8_0 | 22.21 | 48.08 | 22.59 | 290.42 | 4 |
| qwen2.5-1.5b-instruct-q8_0 | 16.92 | 115.52 | 17.68 | 198.49 | 2 |
| gemma-3-1b-it.Q2_K | 16.18 | 28.59 | 16.18 | 28.59 | 1 |
| tinyswallow-1.5b-instruct-q5_k_m | 14.86 | 21.16 | 14.86 | 21.16 | 1 |
| gemma-2-2b-it.Q4_K_M | 14.01 | 126.11 | 14.01 | 126.11 | 1 |
| gemma-2-2b-it-Q6_K | 12.81 | 126.00 | 12.81 | 126.00 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-IQ4_NL | 12.74 | 18.96 | 12.74 | 18.96 | 1 |
| qwen2.5-3b-instruct-q4_k_m | 12.67 | 90.13 | 12.67 | 90.13 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-Q5_K_M | 12.59 | 20.04 | 12.59 | 20.04 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M | 12.25 | 16.82 | 12.25 | 16.82 | 1 |
| llama-3.2-3b-instruct-abliterated-q4_k_m | 7.01 | 62.78 | 7.01 | 62.78 | 1 |
| Phi-3.5-mini-instruct.Q4_K_M | 4.91 | 7.84 | 4.91 | 7.84 | 1 |
| Qwen3-1.7B-UD-Q4_K_XL | 4.70 | 93.54 | 4.70 | 93.54 | 1 |
| Llama-3.2-3B-Instruct-Q6_K | 4.64 | 8.10 | 4.64 | 8.10 | 1 |
| gemma-3-4B-it-QAT-Q4_0 | 4.25 | 6.42 | 4.25 | 6.42 | 1 |
| qwen2.5-3b-instruct-q5_k_m | 3.98 | 9.52 | 3.98 | 9.52 | 1 |
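The Median and Best columns can be read as per-model aggregates over individual benchmark runs. The sketch below shows one way such a row might be produced. The individual run values are hypothetical, chosen only so that the result matches the TG figures of the two-run chatgpt-5-q8_0 row; the raw runs are not shown on the page, and `summarize` is an illustrative name.

```python
from statistics import median


def summarize(runs: list[float]) -> dict[str, float]:
    """Aggregate per-run throughput (tok/s) into median, best, and run count."""
    return {"median": median(runs), "best": max(runs), "runs": len(runs)}


# Hypothetical TG runs (tok/s), not taken from the leaderboard.
tg_runs = [7.28, 42.32]
summary = summarize(tg_runs)
print(summary)
```

With a single run the median and the best are the same number, which is why most rows in the table repeat their values across both columns.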
Head-to-Head Record
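Head-to-head results are what feed a Glicko-2 rating. As a rough illustration, the sketch below computes the Glicko-2 expected score of one device against another, using the standard conversion to the internal scale (ratings centered on 1500, scale factor 173.7178). The opponent's rating and RD are hypothetical, and whether the leaderboard uses these exact defaults is an assumption.

```python
import math

SCALE = 173.7178  # Glicko-2 conversion factor between display and internal scales


def g(phi: float) -> float:
    """Dampening factor: reduces the impact of an opponent with uncertain rating."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi**2 / math.pi**2)


def expected_score(rating: float, opp_rating: float, opp_rd: float) -> float:
    """Expected score (0..1) of a player against an opponent under Glicko-2."""
    mu = (rating - 1500.0) / SCALE
    mu_opp = (opp_rating - 1500.0) / SCALE
    phi_opp = opp_rd / SCALE
    return 1.0 / (1.0 + math.exp(-g(phi_opp) * (mu - mu_opp)))


# iPhone 12 Pro Max (rating 1,863) against a hypothetical opponent at 1500 with RD 50.
print(expected_score(1863, 1500, 50))
# Against an equally rated opponent the expected score is exactly one half.
print(expected_score(1863, 1863, 21))  # 0.5
```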
Performance by App Version
Legend: Improved / Regressed