Generate but Verify: Answering with Faithfulness in RAG-based Question Answering

ACL ARR 2025 February Submission 2319 Authors

14 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: Retrieval-Augmented Generation (RAG) enhances LLMs by grounding answers in retrieved passages, which is key in factual Question Answering. However, generated answers may still be unfaithful, due to either retrieval or generation errors. We introduce the problem of Answering with Faithfulness (AwF), which brings faithfulness prediction to the forefront, explicitly coupling it with answer generation. We define precision-recall metrics tailored to this problem and present a unified framework allowing for (1) tunable control over faithfulness precision and (2) direct evaluation and comparison of different AwF methods. We conduct a comprehensive empirical study across multiple models and benchmarks, evaluating diverse AwF methods and identifying consistent performance trends. Additionally, we demonstrate the use of AwF methods in applications that incorporate different strategies for handling unfaithful answers. Our findings establish AwF as a robust framework, providing a principled way to balance providing answers with applying corrective actions in RAG-based Question Answering.
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: retrieval-augmented generation, knowledge base QA, robustness
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 2319