下に引いて戻る
DeepSeek 3.2 eating the opening think tag on llama.cpp server?

DeepSeek 3.2 eating the opening think tag on llama.cpp server?

DeepSeek 3.2 eating the opening think tag on llama.cpp server?

Hey guys. Having a weird issue with the new DeepSeek V3.2 Unsloth GGUF via llama-server. The model starts reasoning fine, but the actual opening think tag is missing from the output stream. I just see the plain text reasoning, and then the closing tag at the end. Because of this, Open WebUI doesn't collapse the thought block. Im on a 512GB box, command is just llama-server -m model_name -t 32 --flash-attn on. Tried toggling reasoning on/off, didn't help. Is the chat template broken in these sp