往下拉回到首頁
Gemma 4 31B 測試爆雷:複雜的設計規則反而讓 AI 生成的登陸頁面變更爛,還不如什麼都不說

Gemma 4 31B 測試爆雷:複雜的設計規則反而讓 AI 生成的登陸頁面變更爛,還不如什麼都不說

156 landing-page generations through Gemma 4 31B with 52 different system prompts. Rule-dense "design heuristics" prompts scored below the empty baseline. [R]

Setup: Gemma 4 31B Instruct via OpenRouter, temperature 0.7, 3 samples per persona, 156 generations total. Fixed task: one single-file HTML landing page for a fictional luxury-real-estate CRM. Eight required sections, inline styles only. Same user message to every persona. Why a small model specifically? If I ran this on Opus or GPT-5 the baseline would already be great in every condition and persona would be noise. Gemma at 31B leaves measurable headroom. 52 personas across 8 buckets: Empty b