Two AI judges scored our agent's answer 0.85, but it never opened the file
By jflynt76 · 2026-06-21 · 5 points · 0 comments
https://tenureai.dev/writing/llm-as-judge-became-the-default-for-agent-evaluation/
Posted on Hacker News, read on betternews.