Entropy Reveals What You Know: An Entropy-Guided Method for Enhancing the Reliability of Large Language Models
Keywords: Reliability, Large Language Model
Abstract: While large language models (LLMs) encode vast amounts of knowledge about mainstream entities in their parameters, factual inconsistency and untruthfulness often make their responses unreliable and pose significant risks in practical applications.
This paper aims to improve model reliability by making answers to known facts more consistent and by encouraging the model to refuse to answer questions it is uncertain about.
Specifically, we introduce \textbf{SREF}, an entropy-guided approach that enhances the reliability of language models by augmenting inputs with \textbf{S}elf-\textbf{REF}erences, i.e., the model's own understanding of rephrased versions of the question.
We analyze the effectiveness of SREF in enhancing model reliability from the perspectives of entropy and KL divergence.
Extensive experiments on 12 LLMs demonstrate that SREF yields more reliable outputs, with an average improvement of 16.01\% over the baselines and a 15.10\% average improvement in consistency, while also enabling models to identify and acknowledge uncertain facts.
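To make the two quantities named in the abstract concrete, the following is a minimal sketch of Shannon entropy and KL divergence computed over an empirical distribution of sampled answers. It is not the SREF method: the function names, the example answers, and the smoothing constant `eps` are all assumptions introduced here for illustration only.

```python
import math
from collections import Counter


def answer_distribution(answers):
    """Empirical distribution over a list of sampled answer strings (hypothetical helper)."""
    counts = Counter(answers)
    total = len(answers)
    return {answer: count / total for answer, count in counts.items()}


def entropy(dist):
    """Shannon entropy in nats; lower values mean the answers are more consistent."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps (an assumption) guards against answers missing from q."""
    return sum(
        pv * math.log(pv / max(q.get(answer, 0.0), eps))
        for answer, pv in p.items()
        if pv > 0
    )


# Illustrative, made-up samples: one consistent set and one scattered set.
consistent = answer_distribution(["Paris", "Paris", "Paris", "Paris"])
scattered = answer_distribution(["Paris", "Lyon", "Nice", "Paris"])

print(entropy(consistent))
print(entropy(scattered))
print(kl_divergence(consistent, scattered))
```

Under this reading, a model that answers a known fact consistently produces a low-entropy answer distribution, while high entropy is one possible signal that a question is uncertain and could be refused.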
Primary Area: generative models
Submission Number: 10543