opinions | reddit | April 20, 2026 at 09:01 PM

I benchmarked 21 local LLMs on a MacBook Air M5 for code quality AND speed


There are plenty of "bro trust me, this model is better for coding" discussions out there. I wanted to replace the vibes with actual data: which model writes correct code, and how fast does it run on real hardware, tested under identical conditions so the results are directly comparable. No cherry-picked prompts, no subjective impressions, just pass@1 on 164 coding problems with an expanded test suite.

Hardware: MacBook Air M5, 32 GB unified memory

Quantization: Q4_K_M for all models via llam
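For readers unfamiliar with the metric, here is a minimal sketch of how a pass@1 score can be computed. It assumes one generated solution per problem and that a problem counts as passed only if every test succeeds. The post does not show its scoring code, so the function names and the demo outcomes below are hypothetical. The general formula is the standard unbiased pass@k estimator, which reduces to a plain pass rate when n = k = 1.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer than k incorrect samples: any draw of k must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: dict[str, bool]) -> float:
    """Mean pass@1 over all problems, assuming one generated sample per problem."""
    scores = [pass_at_k(1, int(ok), 1) for ok in results.values()]
    return sum(scores) / len(scores)


# Hypothetical outcomes for illustration only; these are not results from the post.
demo = {"problem_0": True, "problem_1": False, "problem_2": True, "problem_3": True}
print(f"pass@1 = {benchmark_pass_at_1(demo):.2%}")
```

With a single sample per problem, the score is simply the fraction of the 164 problems whose generated solution passes its whole test suite.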
