Leaderboard

On-device LLM performance rankings powered by Glicko-2

Pixel 8

Android

Rank: #138
Rating: 1,492 (±14 RD)
Win Rate: 49.3%
Conservative Rating: 1,464
TG Rating: 1,483
PP Rating: 1,549
Matches: 1,344
Record: 662W – 682L
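The summary figures above are consistent with one another, and the sketch below shows one way to check them. It assumes the win rate is wins divided by total matches, and that the conservative rating is the rating minus two rating deviations. That second formula is an assumption inferred from the displayed numbers (1,492 − 2 × 14 = 1,464), not something the leaderboard states. The function names are hypothetical.

```python
def win_rate(wins: int, losses: int) -> float:
    """Percentage of matches won; returns 0.0 when no matches were played."""
    total = wins + losses
    if total == 0:
        return 0.0
    return 100.0 * wins / total


def conservative_rating(rating: float, rd: float, k: float = 2.0) -> float:
    """Rating minus k rating deviations.

    k = 2 is an assumption that happens to match the figures shown
    on this page; the site may use a different rule.
    """
    return rating - k * rd


wins, losses = 662, 682
print(wins + losses)                      # 1344 matches
print(round(win_rate(wins, losses), 1))   # 49.3
print(conservative_rating(1492, 14))      # 1464.0
```

Subtracting a multiple of the rating deviation penalizes devices with few matches, whose ratings are less certain. With an RD of only 14 the penalty here is small.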

Models Tested

| Model | TG Median (tok/s) | PP Median (tok/s) | TG Best (tok/s) | PP Best (tok/s) | Runs |
|---|---|---|---|---|---|
| Thinker-SmolLM2-135M-Instruct-Reasoning.i1-Q4_K_M | 28.34 | 211.70 | 28.34 | 211.70 | 1 |
| gemma-3-1b-it-q4_0_s | 24.85 | 141.23 | 24.85 | 141.23 | 1 |
| LFM2.5-1.2B-Instruct-Q4_K_M | 22.27 | 91.43 | 22.27 | 91.43 | 1 |
| SmolLM2-360M-Instruct.i1-IQ4_XS | 19.95 | 116.81 | 19.95 | 116.81 | 1 |
| smollm2-360m-instruct-q8_0 | 19.05 | 116.45 | 19.66 | 133.61 | 2 |
| LFM2.5-1.2B-Thinking-Q4_K_M | 14.43 | 62.66 | 14.43 | 62.66 | 1 |
| gemma-3-270m-it-IQ4_NL | 12.36 | 249.50 | 12.36 | 249.50 | 1 |
| Dolphin3.0-Llama3.2-1B-Q4_K_M | 10.60 | 32.61 | 10.60 | 32.61 | 1 |
| SmolLM2-1.7B-Instruct-abliterated.i1-Q4_K_M | 10.40 | 28.92 | 10.40 | 28.92 | 1 |
| Dolphin3.0-Llama3.2-1B-Q8_0 | 10.10 | 41.01 | 10.10 | 41.01 | 1 |
| gemma-3-1b-it.Q8_0 | 10.02 | 93.96 | 10.02 | 93.96 | 1 |
| smollm2-1.7b-instruct-q4_k_m | 8.72 | 20.43 | 8.72 | 20.43 | 1 |
| gemma-3-1b-it.Q5_K_M | 8.66 | 41.29 | 8.66 | 41.29 | 1 |
| SmolLM2-1.7B-Instruct-abliterated.i1-IQ4_XS | 8.64 | 19.70 | 8.64 | 19.70 | 1 |
| qwen2.5-1.5b-instruct-q8_0 | 8.49 | 45.86 | 9.59 | 54.06 | 3 |
| DeepSeek-R1-Distill-Qwen-1.5B-Abliterated-dpo.i1-IQ4_XS | 8.34 | 26.65 | 8.34 | 26.65 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-Abliterated-dpo.Q4_K_M | 7.75 | 24.30 | 7.75 | 24.30 | 1 |
| Qwen_Qwen3-0.6B-IQ4_XS | 7.48 | 66.64 | 7.48 | 66.64 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-uncensored.Q8_0 | 6.99 | 54.54 | 6.99 | 54.54 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-Abliterated-dpo.Q8_0 | 6.92 | 32.65 | 6.92 | 32.65 | 1 |
| llama-3.2-1b-instruct-q8_0 | 6.65 | 64.15 | 12.85 | 78.82 | 9 |
| SmolLM2-1.7B-Instruct-Q8_0 | 6.03 | 22.95 | 6.03 | 22.95 | 1 |
| qwen2.5-3b-instruct-q5_k_m | 5.59 | 16.32 | 6.00 | 16.54 | 3 |
| Hermes-3-Llama-3.2-3B-abliterated.i1-Q4_K_M | 5.48 | 12.50 | 5.48 | 12.50 | 1 |
| SmallThinker-3B-Preview-abliterated.i1-IQ4_XS | 5.45 | 12.20 | 5.45 | 12.20 | 1 |
| Phi-3.5-mini-instruct.Q4_K_M | 5.05 | 12.21 | 6.04 | 13.74 | 3 |
| gemma-2-2b-it-abliterated-Q4_K_M | 4.89 | 14.72 | 4.89 | 14.72 | 1 |
| Llama-3.2-3B-Instruct-Q6_K | 4.74 | 13.12 | 5.37 | 15.16 | 6 |
| gemma-2-2b-it-Q6_K | 3.79 | 17.51 | 5.06 | 22.10 | 5 |
| Gemmasutra-Mini-2B-v1-Q6_K | 3.08 | 11.74 | 4.73 | 13.21 | 2 |
| gemma-3-4b-it.Q5_K_M | 2.31 | 10.27 | 2.31 | 10.27 | 1 |
