Mind the Generation Process: Fine-grained Confidence Estimation Throughout the Generation of LLMs

ACL ARR 2024 December Submission 2000 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Accurate confidence estimation for large language models (LLMs) is crucial for improving the reliability of their generations. Existing methods typically estimate confidence from limited perspectives and at specific token positions, and therefore cannot provide continuous confidence estimates throughout the generation process. In this paper, we introduce FineCE, a novel fine-grained confidence estimation method that provides accurate, real-time confidence scores during generation. Specifically, we develop a pipeline for constructing training data that captures the inherent response behavior of LLMs, and design data formats for three different tasks to teach LLMs to express confidence. In addition, we propose the Backward Confidence Integration (BCI) strategy, which integrates confidence scores from subsequent text sequences to yield a holistic confidence estimate for the current text sequence. We further provide three strategies for identifying the optimal positions at which to perform confidence estimation. Extensive experiments demonstrate that FineCE consistently outperforms existing baselines across various confidence estimation tasks. Our code and all baselines used in the paper are available on GitHub: https://anonymous.4open.science/r/FineCE/.
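To make the Backward Confidence Integration idea concrete, the following is a minimal, hypothetical Python sketch of what integrating confidence scores from subsequent text sequences into the estimate for the current sequence could look like. The abstract does not specify the aggregation rule, so the discounted blending below (function name backward_confidence_integration and the decay parameter) is an illustrative assumption, not the paper's actual formulation.

# Hypothetical sketch of Backward Confidence Integration (BCI).
# The aggregation rule (discounted blending with a decay factor) is an
# assumption for illustration; the paper's actual formula may differ.
from typing import List

def backward_confidence_integration(step_confidences: List[float],
                                    decay: float = 0.8) -> List[float]:
    """Propagate confidence backwards: each position's score blends its own
    local confidence with the integrated confidence of the text that follows."""
    integrated = [0.0] * len(step_confidences)
    for t in reversed(range(len(step_confidences))):
        if t == len(step_confidences) - 1:
            # Last position: only its local confidence is available.
            integrated[t] = step_confidences[t]
        else:
            # Blend the local score with the already-integrated future score.
            integrated[t] = ((1 - decay) * step_confidences[t]
                             + decay * integrated[t + 1])
    return integrated

if __name__ == "__main__":
    # Local confidences emitted at intermediate generation positions.
    local_scores = [0.9, 0.7, 0.4, 0.6]
    print(backward_confidence_integration(local_scores))

In this sketch, a low confidence appearing later in the generation pulls down the holistic estimate of earlier positions, which matches the abstract's description of using subsequent sequences to inform the current one.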
Paper Type: Long
Research Area: Generation
Research Area Keywords: Generation
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2000
