VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO
By timhigins · 2026-06-23 · 17 points · 4 comments
https://arxiv.org/abs/2606.16140
Posted on Hacker News.