Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference

28 Sept 2024 (modified: 15 Dec 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: interpretability, faithfulness, large language models, constrained generation
TL;DR: We propose a probabilistic inference paradigm that provides fine-grained and lookahead rewards to ensure that LLM-generated rationales are accurate and faithful.
Abstract: As large language models (LLMs) are increasingly applied to complex reasoning tasks, achieving both accurate task performance and faithful explanations becomes crucial. However, LLMs often generate unfaithful explanations, partly because they do not consistently adhere to the provided context. Existing approaches to this problem either rely on superficial calibration, such as decomposed Chain-of-Thought prompting, or require costly retraining to improve model faithfulness. In this work, we propose a probabilistic inference paradigm that provides fine-grained and lookahead rewards to ensure that LLM-generated rationales are logically coherent and comprehensive. These rewards are derived from a domain-specific proposal distribution, allowing for optimised sequential Monte Carlo approximations. Our evaluations across three different reasoning tasks show that this method, which allows for controllable generation during inference, improves both the accuracy and faithfulness of LLMs while keeping computational costs similar to those of existing decoding techniques. This method offers a promising path towards making LLMs more reliable for reasoning tasks without sacrificing performance or efficiency.
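To make the abstract's inference scheme concrete, the sketch below illustrates the general shape of reward-guided sequential Monte Carlo decoding with a per-step (fine-grained) reward and a lookahead reward, as the abstract describes. It is a minimal illustration, not the paper's implementation: `toy_lm`, `step_reward`, and `lookahead_reward` are hypothetical stand-ins, and a real system would plug in an LLM and task-specific reward models derived from the proposal distribution.

```python
# Minimal sketch of SMC-style decoding with dual (per-step + lookahead) rewards.
# All components here are hypothetical stand-ins, not the submission's method.
import math
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def toy_lm(prefix):
    """Hypothetical base model: uniform next-token distribution (ignores prefix)."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

def step_reward(prefix, tok, context):
    """Fine-grained reward: favour tokens grounded in the provided context."""
    return 1.0 if tok in context else 0.1

def lookahead_reward(prefix, context):
    """Lookahead reward: crude proxy for how faithful a rationale continuing
    from `prefix` could be (fraction of prefix tokens found in the context)."""
    if not prefix:
        return 0.5
    return sum(t in context for t in prefix) / len(prefix)

def smc_decode(context, n_particles=8, max_len=6, seed=0):
    rng = random.Random(seed)
    particles = [([], 0.0) for _ in range(n_particles)]  # (tokens, log-weight)
    for _ in range(max_len):
        proposals = []
        for toks, logw in particles:
            probs = toy_lm(toks)
            # Proposal: tilt base-model probabilities by the fine-grained reward.
            scores = {t: p * step_reward(toks, t, context) for t, p in probs.items()}
            z = sum(scores.values())
            tok = rng.choices(list(scores), weights=list(scores.values()))[0]
            new_toks = toks + [tok]
            # Incremental importance weight for the tilted proposal, plus a
            # heuristic lookahead twist on the partial rationale.
            logw += math.log(z)
            logw += math.log(lookahead_reward(new_toks, context) + 1e-9)
            proposals.append((new_toks, logw))
        # Resample particles in proportion to their weights, then reset weights.
        ws = [math.exp(lw) for _, lw in proposals]
        idx = rng.choices(range(n_particles), weights=ws, k=n_particles)
        particles = [(proposals[i][0], 0.0) for i in idx]
    # Return the particle judged most faithful under the lookahead proxy.
    return max(particles, key=lambda p: lookahead_reward(p[0], context))[0]

print(smc_decode(context={"cat", "sat", "mat"}))
```

The resampling step is what distinguishes this from ordinary reranking: partial rationales with low combined reward are pruned during generation, so compute stays comparable to standard sampling-based decoding, which is the efficiency claim the abstract makes.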
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 14116