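Is your AI assistant secretly inventing features? Research finds these models deliberately ignore the rules, and kept doing so across more than 2,400 messages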

Production LLM systematically violates tool schema constraints to invent UI features; observed over ~2,400 messages [D]

Writeup of an emergent behavior I observed in production. Posting here for methodological critique and pointers to related work.

Context: a conversational AI system whose tool schema exposes a single tool with 5 enumerated action types, each with an explicit description. Across ~2,400 messages, the model uses the enum correctly most of the time. When it deviates, the deviation is the point of interest.

Key observations: The action types are repurposed consistently across unrelated conversations: in