Keywords: LLM, NLG, uncertainty estimation, uncertainty measures, proper scoring rules
TL;DR: We propose a theoretically grounded uncertainty measure for LLMs that significantly reduces computational costs while maintaining state-of-the-art performance.
Abstract: Large language models (LLMs) are increasingly employed in real-world applications, driving a need to determine when their generated text can be trusted or should be questioned. To assess the trustworthiness of the generated text, reliable uncertainty estimation is essential. Current LLMs generate text through a stochastic process that can lead to different output sequences for the same prompt. Consequently, leading uncertainty measures require generating multiple output sequences to estimate the LLM’s uncertainty. However, generating additional output sequences is computationally expensive, making these uncertainty estimates impractical at scale. In this work, we challenge the theoretical foundations of the leading measures and derive an alternative measure that eliminates the need for generating multiple output sequences. Our new measure is based solely on the negative log-likelihood of the most likely output sequence. This vastly simplifies uncertainty estimation while maintaining theoretical rigor. Empirical results demonstrate that our new measure achieves state-of-the-art performance across various models and tasks. Our work lays the foundation for reliable and efficient uncertainty estimation in LLMs, challenging the necessity of the more complicated methods currently leading the field.
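As a rough illustration of the measure described in the abstract (the negative log-likelihood of the most likely output sequence), the following minimal sketch sums the negative log-probabilities of the tokens picked at each step. It makes two assumptions that are not taken from the paper: the most likely sequence is approximated by greedy decoding, and the per-step logits are supplied up front as a hypothetical `step_logits` list, whereas a real LLM would produce each step's logits conditioned on the tokens already chosen. The function names are illustrative only.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def greedy_sequence_nll(step_logits):
    """Return (tokens, nll) for a greedily decoded sequence.

    step_logits: hypothetical list of per-step logit vectors. In a real
    model each vector would depend on the previously chosen tokens.
    Greedy decoding is used here as an assumed stand-in for finding the
    most likely output sequence.
    """
    tokens = []
    nll = 0.0
    for logits in step_logits:
        logprobs = log_softmax(logits)
        # Pick the highest-probability token at this step.
        best = max(range(len(logprobs)), key=logprobs.__getitem__)
        tokens.append(best)
        # Accumulate the negative log-likelihood of the chosen token.
        nll -= logprobs[best]
    return tokens, nll


# Toy inputs: sharply peaked logits vs. nearly flat logits.
confident_logits = [[5.0, 0.0, 0.0], [0.0, 6.0, 0.0]]
uncertain_logits = [[1.0, 0.9, 0.8], [0.5, 0.6, 0.4]]

confident_tokens, confident_nll = greedy_sequence_nll(confident_logits)
uncertain_tokens, uncertain_nll = greedy_sequence_nll(uncertain_logits)
```

In this toy setup the nearly flat logits yield a higher score than the sharply peaked ones, which is the sense in which a single decoded sequence can carry an uncertainty signal without sampling additional outputs.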
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10283