CritiCal: Can Natural Language Critiques Help LLMs' Uncertainty or Confidence Calibration?

ACL ARR 2026 January Submission8707 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: confidence calibration, uncertainty quantification, natural language critiques
Abstract: Confidence calibration is critical for the safe use of Large Language Models (LLMs), where clearly verbalized confidence enhances user trust. Traditional methods that mimic reference confidence expressions often fail to exploit the logic in the model's original reasoning chain. We propose natural language critique as a solution: it is well suited to confidence calibration because precise gold confidence labels are hard to obtain and often require multiple generations, whereas assessing whether a stated confidence is appropriate is easy, requiring only an analysis of its internal logic and answer correctness. This paper studies how natural language critiques can enhance verbalized confidence along two axes: (1) **What to critique**: uncertainty (question-focused) or confidence (answer-specific)? Our analysis shows that confidence suits multiple-choice tasks, while uncertainty excels in open-ended scenarios. (2) **How to critique**: self-critique or critique-based calibration training? We propose **Self-Critique**, which enables LLMs to critique and optimize their own confidence, and **CritiCal**, which uses natural language **Criti**ques to train confidence **Cal**ibration. Experiments show that CritiCal significantly outperforms Self-Critique and other competitive baselines, **even surpassing its teacher, GPT-4o**, on complex reasoning tasks. CritiCal also generalizes robustly to out-of-distribution settings, demonstrating its reliability.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: calibration/uncertainty, free-text/natural language explanations, robustness
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 8707