
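OpenAI Livestream Event
OpenAI will hold a livestream event. The specific announcements, product launches, or demos will be revealed during the stream.
The last time OpenAI sprang a surprise livestream, they dropped GPT-4 Turbo outright and then changed all of their pricing overnight.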

Hi all, I wanted to share a setup that’s working for me with Qwen3.6-35B-A3B on a laptop RTX 4060 (8GB VRAM) + 96GB RAM. This is not an interactive chat setup. I’m using it as a coding subagent inside an agentic pipeline, so some of the choices below are specific to that use case.

TL;DR
- Qwen3.6 35B A3B runs fine on 8GB VRAM + RAM as coding subagent
- my real bug was not a crash: unlimited thinking consumed the whole max_tokens budget
- disabling thinking fixed it
- better fix: use per