Quantifying Faithful Confidence Expression in Large Reasoning Models

Published: 03 Jun 2026, Last Modified: 03 Jun 2026
Venue: AI4GOOD Workshop 2026 Spotlight
License: CC BY 4.0
Keywords: large reasoning models, calibration/uncertainty, faithfulness
TL;DR: We formalize the problem of faithful confidence expression for large reasoning models and present a novel evaluation framework that addresses the unique challenges of studying long-form, dynamically evolving reasoning traces.
Abstract: Reliable uncertainty communication is critical to the trustworthiness of LLMs, yet faithful calibration (FC), the alignment between a model's intrinsic confidence and its linguistically expressed confidence, remains a persistent failure mode. This challenge is especially important for large reasoning models (LRMs), whose extended reasoning traces are often interpreted by users as evidence of competence and certainty. Despite this, the extent to which LRMs faithfully express their confidence is poorly understood, and the prevailing paradigm for measuring FC does not generalize well to long chain-of-thought traces. We introduce a framework to systematically quantify the FC of LRMs, analyzing linguistic decisiveness against three sources of internal uncertainty: token probabilities, hidden states, and sampling consistency. We also introduce a prefix-conditioned sampling approach to control for conditional dependence and variation in step structure across responses. Applying our method across leading models, datasets, and prompts, we find that FC remains a significant challenge for LRMs: reasoning does not automatically improve FC, and prompt interventions that improve FC for non-reasoning models do not transfer to LRMs. Moreover, different confidence estimators produce divergent views of the same trace, revealing fragility in prior evaluation methods. Overall, we establish FC as a crucial reliability and alignment target for LRMs, particularly as these models enter high-stakes contexts.
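As a rough illustration of the quantities the abstract refers to, the sketch below estimates confidence from sampling consistency and compares it with a decisiveness score. It is not the authors' framework: the function names, the majority-agreement estimator, the absolute-difference gap, and the example values are all assumptions made for illustration, and the decisiveness score is taken as given on a 0 to 1 scale.

```python
from collections import Counter


def consistency_confidence(answers):
    """Sampling-consistency confidence: the fraction of sampled final
    answers that agree with the most common answer (an assumed estimator)."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


def faithfulness_gap(decisiveness, confidence):
    """Absolute gap between expressed decisiveness and intrinsic confidence,
    both assumed to lie in [0, 1]. Smaller means more faithful."""
    for name, value in (("decisiveness", decisiveness), ("confidence", confidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return abs(decisiveness - confidence)


# Hypothetical data: four sampled final answers to the same question,
# and a hypothetical decisiveness score of 0.9 for the written response.
samples = ["42", "42", "41", "42"]
confidence = consistency_confidence(samples)  # 3 of 4 samples agree
gap = faithfulness_gap(0.9, confidence)
print(f"confidence={confidence:.2f}, gap={gap:.2f}")
```

In this toy case the response sounds more certain than the samples support, so the gap is positive. A token-probability or hidden-state estimator could be substituted for `consistency_confidence` without changing the comparison step.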
Submission Number: 275