
OpenAI 直播活動
OpenAI 將舉辦一場直播活動。在直播期間將揭露具體的公告、產品發布或示範內容。
上一次 OpenAI 突然搞直播,他們直接丟出 GPT-4 Turbo,然後一夜之間改掉所有定價

I just finished a full Terminal-Bench 2.0 run (445 trials) with MiniMax-M2.7 (Q8_0, unsloth GGUF) running locally on a Mac Studio M3 Ultra with 512GB unified memory. The result: 41.3% mean — which is actually worse than the 42.7% I got with M2.5 on the same hardware and config. The numbers: 434 trials, 184 solved, 250 failed 198 errors — 187 of those were AgentTimeoutError (the model running out of clock, not crashing) Mean reward: 0.413 10-17 tokens/second For comparison, M2.5 on the