Mind the Generation Process: Fine-grained Confidence Estimation Throughout the Generation of LLMs

ACL ARR 2024 December Submission 2000 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Accurate confidence estimation for large language models (LLMs) is crucial for improving the reliability of their generations. Existing methods typically estimate confidence from limited perspectives and at specific token positions, and therefore cannot provide continuous confidence estimates throughout the generation process. In this paper, we introduce FineCE, a novel fine-grained confidence estimation method that provides accurate, real-time confidence scores during generation. Specifically, we develop a pipeline for constructing training data that captures the inherent response behavior of LLMs, and design data formats for three different tasks to teach LLMs to express confidence. In addition, we propose the Backward Confidence Integration (BCI) strategy, which integrates confidence scores from subsequent text sequences to yield a holistic confidence estimate for the current text sequence. We further provide three strategies for identifying the optimal positions at which to perform confidence estimation. Extensive experiments demonstrate that FineCE consistently outperforms existing baselines across various confidence estimation tasks. Our code and all baselines used in the paper are available on GitHub: https://anonymous.4open.science/r/FineCE/.
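To make the Backward Confidence Integration idea concrete, the following is a minimal, hypothetical Python sketch of what integrating confidence scores from subsequent text sequences into the estimate for the current sequence could look like. The abstract does not specify the aggregation rule, so the discounted blending below (function name backward_confidence_integration and the decay parameter) is an illustrative assumption, not the paper's actual formulation.

# Hypothetical sketch of Backward Confidence Integration (BCI).
# The aggregation rule (discounted blending with a decay factor) is an
# assumption for illustration; the paper's actual formula may differ.
from typing import List

def backward_confidence_integration(step_confidences: List[float],
                                    decay: float = 0.8) -> List[float]:
    """Propagate confidence backwards: each position's score blends its own
    local confidence with the integrated confidence of the text that follows."""
    integrated = [0.0] * len(step_confidences)
    for t in reversed(range(len(step_confidences))):
        if t == len(step_confidences) - 1:
            # Last position: only its local confidence is available.
            integrated[t] = step_confidences[t]
        else:
            # Blend the local score with the already-integrated future score.
            integrated[t] = ((1 - decay) * step_confidences[t]
                             + decay * integrated[t + 1])
    return integrated

if __name__ == "__main__":
    # Local confidences emitted at intermediate generation positions.
    local_scores = [0.9, 0.7, 0.4, 0.6]
    print(backward_confidence_integration(local_scores))

In this sketch, a low confidence appearing later in the generation pulls down the holistic estimate of earlier positions, which matches the abstract's description of using subsequent sequences to inform the current one.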
Paper Type: Long
Research Area: Generation
Research Area Keywords: Generation
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2000
