Keywords: Uncertainty Quantification; Interactive Learning; Calibration Error; Human–AI Interaction; Theoretical Analysis
Abstract: Hallucinations arise when large language models (LLMs) guess rather than acknowledge their underlying uncertainty. Existing static strategies for mitigating hallucinations have been only partially successful, largely because they do not explicitly model the information gain from interacting with the external environment. Researchers need a general method to proactively steer users toward informative clarifications, thereby unlocking the model’s effective capacity under underspecified inputs. We model the uncertainty of LLMs in interactive settings and uncover a mechanism of active calibration between model concepts and human evaluations that improves reliability. We show that the calibration error of LLM density estimation admits a non-vanishing lower bound under non-interactive learning, while interaction empirically reduces it. We further show that calibration error identifies informative queries and that calibration can be accelerated by shifting the query distribution from an imbalanced to a balanced regime. Guided by these insights, we propose a calibration-driven Interactive Learning Strategy (ILS) that selects clarification queries by optimizing calibration error, yielding both theoretical guarantees and empirical gains in reliability. Code and data are available at https://anonymous.4open.science/r/DemystifyingUncertainty/.
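As a rough illustration of the calibration-error-driven selection idea described above, the sketch below computes a standard expected calibration error (ECE) and ranks candidate clarification queries by how much their (simulated) answers would reduce it. The function names, the ECE binning scheme, and the toy data are illustrative assumptions for exposition, not the paper's ILS implementation.

```python
import numpy as np

def expected_calibration_error(confidences, correctness, n_bins=10):
    """Standard ECE: bin predictions by confidence and average the absolute
    gap between mean confidence and empirical accuracy, weighted by bin mass."""
    confidences = np.asarray(confidences, dtype=float)
    correctness = np.asarray(correctness, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(confidences[mask].mean() - correctness[mask].mean())
    return ece

def pick_clarification_query(per_query_outcomes, baseline):
    """Rank candidate clarification queries by the drop in ECE their
    (simulated or held-out) answers would produce; return the best one."""
    base_ece = expected_calibration_error(*baseline)
    gains = {q: base_ece - expected_calibration_error(conf, corr)
             for q, (conf, corr) in per_query_outcomes.items()}
    return max(gains, key=gains.get), gains

# Toy usage with synthetic confidences and correctness labels (hypothetical data).
rng = np.random.default_rng(0)
baseline = (rng.uniform(0.5, 1.0, 200), rng.integers(0, 2, 200))
candidates = {f"q{i}": (rng.uniform(0.4, 1.0, 200), rng.integers(0, 2, 200))
              for i in range(3)}
best_query, gains = pick_clarification_query(candidates, baseline)
print(best_query, gains)
```

Under these assumptions, the query whose answer most reduces ECE is treated as the most informative clarification to pose next.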
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: Uncertainty Quantification; Interactive Learning; Calibration Error; Human–AI Interaction; Theoretical Analysis
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Theory
Languages Studied: English
Submission Number: 1485