Keywords: Alignment, Human-Computer Interaction, Bias, LLMs, Overconfidence, Hallucination
Abstract: We investigate the calibration of large language models' (LLMs') confidence across diverse tasks. The results of our preregistered study show that current LLMs are, like people, too sure they are right: on average, their confidence exceeds their accuracy. This tendency toward overconfidence is moderated, however, by a pronounced hard-easy effect: overconfidence is greatest on difficult tests, whereas easy tests show substantial underconfidence. We develop LifeEval, a test for evaluating model calibration across levels of difficulty.
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: agent communication, safety and alignment for agents, grounded agents, model bias/fairness evaluation, human-AI interaction/cooperation, safety and alignment
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 7024