opinionsreddit2026年4月23日上午05:40

我們對 18 個大型語言模型進行 OCR 基準測試（7000+ 次呼叫）— 便宜的舊模型經常勝出。完整資料集與框架已開源。

We benchmarked 18 LLMs on OCR (7k+ calls) — cheaper/old models oftentimes win. Full dataset + framework open-sourced.

TLDR; We were overpaying for OCR, so we compared flagship models with cheaper and older models. New mini-bench + leaderboard. Free tool to test your own documents. Open Source. We’ve been looking at OCR / document extraction workflows and kept seeing the same pattern: Too many teams are either stuck in legacy OCR pipelines, or are overpaying badly for LLM calls by defaulting to the newest/ biggest model. We put together a curated set of 42 standard documents and ran every model 10 times under

閱讀原文 →

相關報導

OpenAI 直播活動

ChatGPT 圖像 2.0 來了，生成圖片的能力大升級

「再等六個月就會變好」的說法只撐過一輪就破功了

Mistral Medium 3.5 在 AMD Strix Halo 上跑起來超慢，準備好熬夜吧