FingER: Fact-Level Answerability for Explainable Refusals in Multi-Hop RAG

ICLR 2026 Conference Submission 24070 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Large Language Models, hallucination, Retrieval Augmented Generation
Abstract: Large language models (LLMs) are widely adopted in retrieval-augmented generation (RAG) systems for multi-hop reasoning tasks. While prior works make effective use of retrieved external knowledge, they often neglect the LLM's internal factual knowledge, resulting in excessive answer refusals with limited explanations. To address this, we propose FingER (Fine-grained Explainable Refusal), a post-training approach that elicits the model's ability to use its internal factual knowledge when external knowledge is missing. Furthermore, FingER provides well-reasoned, explainable justifications for its refusals by analyzing the fact-verification status at each step of a multi-hop process. Experimental results on the MuSiQue dataset demonstrate that FingER effectively balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of multi-hop RAG systems.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 24070
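
The abstract describes FingER's core behavior: verify the fact needed at each hop against the retrieved context, fall back to the model's internal factual knowledge when retrieval misses, and otherwise emit an explainable refusal naming the unverified hop. The minimal Python sketch below illustrates that decision flow only; every name in it (`Hop`, `supported_by_context`, `supported_by_internal_knowledge`, `answer_or_refuse`) is a hypothetical placeholder, not the paper's actual training or inference procedure.

```python
# A minimal, hypothetical sketch of the per-hop verification-and-refusal idea
# described in the abstract. All names below are illustrative placeholders,
# not the paper's implementation.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hop:
    question: str            # the sub-question for this reasoning step
    context: Optional[str]   # retrieved passage, or None on a retrieval miss


def supported_by_context(hop: Hop) -> bool:
    """Placeholder check: a real system would test whether the retrieved
    passage entails the fact needed at this hop (e.g. via an NLI model or
    an LLM judge); here we only check that a passage was retrieved."""
    return hop.context is not None


def supported_by_internal_knowledge(hop: Hop) -> bool:
    """Placeholder fallback: a real system would probe the LLM's parametric
    knowledge; stubbed to False so the refusal branch runs in the demo."""
    return False


def answer_or_refuse(hops: list) -> str:
    """Answer only if every hop is verifiable via retrieved context or
    internal knowledge; otherwise refuse and name the failing hop."""
    for i, hop in enumerate(hops, start=1):
        if supported_by_context(hop):
            continue  # external knowledge covers this hop
        if supported_by_internal_knowledge(hop):
            continue  # internal factual knowledge fills the retrieval gap
        return (f"Refusal: hop {i} ({hop.question!r}) is supported neither by "
                f"the retrieved context nor by the model's internal knowledge.")
    return "Answer: all hops verified; compose the final answer."


if __name__ == "__main__":
    hops = [
        Hop("Who directed the film?", "The film was directed by R. Example."),
        Hop("Where was that director born?", None),  # retrieval miss
    ]
    print(answer_or_refuse(hops))
```

In this toy run, hop 2 fails both checks, so the sketch returns a refusal that cites the specific unsupported step, mirroring the fact-level, explainable refusals the abstract claims, rather than a blanket "I don't know".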