SAM ALTMAN SAYS AGENTS ARE COMING • CHATGPT GAINED SENTIENCE FOR 4 SECONDS • GOOGLE RELEASES 40th LLM THIS WEEK • NVIDIA MARKET CAP EXCEEDS REALITY • ANTHROPIC ENGINEER DISCOVERS NEW FORM OF GRIEF • MISTRAL RAISES AT VALUATION OF GROSS DOMESTIC PRODUCT
breakthroughs • WTF 5.8 • via arXiv cs.AI

Confidence Calibration in Large Language Models

"LLMs are officially as delusional as humans regarding their own intelligence."

Explain Like I'm Normal

A new study reveals that AI models suffer from the same 'hard-easy' bias as people, being wildly overconfident on difficult tasks while underestimating themselves on simple ones. Researchers released LifeEval, a new framework to measure how well a model actually knows what it doesn't know. Improving this calibration is essential for building agents that can safely flag when they need human help.
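For the curious, here is a minimal sketch of what "overconfident on hard, underconfident on easy" looks like in numbers. It is not LifeEval's method, which the summary above does not describe. The helper `confidence_gap` and all the toy data are hypothetical, made up purely for illustration: the gap is the model's average stated confidence minus how often it was actually right.

```python
from statistics import mean


def confidence_gap(confidences, correct):
    """Mean stated confidence minus observed accuracy.

    Positive means overconfident, negative means underconfident.
    This is a hypothetical helper, not part of any published framework.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need two non-empty lists of equal length")
    accuracy = mean(1.0 if c else 0.0 for c in correct)
    return mean(confidences) - accuracy


# Toy data, invented for illustration only (not results from the study).
hard_conf = [0.9, 0.8, 0.9, 0.8]             # model sounds very sure
hard_correct = [True, False, False, False]   # but is mostly wrong

easy_conf = [0.7, 0.6, 0.7, 0.6]             # model hedges
easy_correct = [True, True, True, True]      # but is always right

hard_gap = confidence_gap(hard_conf, hard_correct)  # positive: overconfident
easy_gap = confidence_gap(easy_conf, easy_correct)  # negative: underconfident

print(f"hard tasks gap: {hard_gap:+.2f}")
print(f"easy tasks gap: {easy_gap:+.2f}")
```

A well-calibrated model would land near zero on both. The hard-easy pattern is the two gaps pulling in opposite directions.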

#calibration #evaluation #llm #research
