MiniMax2.7 a surprise flop on Terminal Bench? What models are people who build AI agents with Claude actually using?

MiniMax2.7 Local Results on Terminal Bench. Dud. Anyone using this for agent coding in Claude?

I just finished a full Terminal-Bench 2.0 run (445 trials) with MiniMax-M2.7 (Q8_0, unsloth GGUF) running locally on a Mac Studio M3 Ultra with 512GB unified memory. The result: 41.3% mean, which is actually worse than the 42.7% I got with M2.5 on the same hardware and config.

The numbers:

- 434 trials, 184 solved, 250 failed
- 198 errors, 187 of which were AgentTimeoutError (the model running out of clock, not crashing)
- Mean reward: 0.413
- 10-17 tokens/second

For comparison, M2.5 on the