SALSA: SALiency-based Source Attribution for RAG Systems

ACL ARR 2024 December Submission 2414 Authors

16 Dec 2024 (modified: 05 Feb 2025), ACL ARR 2024 December Submission, CC BY 4.0
Abstract: Retrieval-Augmented Generation (RAG) systems are being rapidly adopted to give LLMs access to up-to-date external knowledge without constant retraining. A major challenge in adopting RAG systems is user trust: users need to be able to quickly verify that system responses are factually correct given the retrieved knowledge. Attribution (citation) systems address this need. However, most implementations rely either on hallucination-prone prompting methods or on post-hoc analysis that may not reflect the information the LLM actually used during response generation. We propose an attribution method for RAG systems that requires no special prompting or external evaluation; instead, it relies only on the LLM itself, the original context presented in the user query, and the subsequently generated response. Specifically, we use the response loss to compute a saliency map over the entire context, including the retrieved documents. We then derive sets of context spans likely to support or conflict with each sentence in the response. Experiments with end-to-end RAG pipelines show that the proposed saliency-based approach outperforms prompting on granular span attribution while being orders of magnitude more efficient. Additionally, by deriving saliency measurements directly from the LLM, we maximize the likelihood that the cited text actually influenced the response, providing better explainability.
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: retrieval-augmented generation, interpretability, feature attribution
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 2414
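The abstract describes computing a saliency map over the context from the response loss and then grouping it into supporting or conflicting spans. The sketch below illustrates that idea on a toy model only; it is not the authors' implementation. It assumes a gradient-times-input saliency, a hypothetical mean-pooled linear "model" in place of an LLM, and a sign convention (positive score means the token lowers the response loss, read here as support; negative means it raises the loss, read here as conflict). The names `response_loss_saliency`, `span_scores`, and the example spans are all illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def response_loss_saliency(context_embs, weights, target_id):
    """Return (loss, per-token saliency) for a toy model.

    Toy model (an assumption, standing in for an LLM):
        logits = W @ mean(context_embs)
        loss   = cross-entropy of the response token `target_id`
    Saliency is the negative gradient of the loss w.r.t. each token
    embedding, dotted with that embedding (gradient x input).
    """
    n = len(context_embs)
    dim = len(context_embs[0])
    mean = [sum(e[d] for e in context_embs) / n for d in range(dim)]
    logits = [sum(w[d] * mean[d] for d in range(dim)) for w in weights]
    probs = softmax(logits)
    loss = -math.log(probs[target_id])

    # Analytic gradient of cross-entropy: dL/dlogits = probs - onehot(target)
    dlogits = [p - (1.0 if k == target_id else 0.0) for k, p in enumerate(probs)]
    # Backpropagate through the linear layer: dL/dmean = W^T @ dlogits
    dmean = [
        sum(weights[k][d] * dlogits[k] for k in range(len(weights)))
        for d in range(dim)
    ]
    # Each token contributes 1/n to the mean, so dL/d(emb_i) = dmean / n.
    scores = [-sum(dmean[d] * e[d] for d in range(dim)) / n for e in context_embs]
    return loss, scores


def span_scores(token_scores, spans):
    """Sum token saliency over each named [start, end) span."""
    return {name: sum(token_scores[s:e]) for name, (s, e) in spans.items()}


# Hypothetical example: four context tokens from two retrieved spans.
EMBS = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.5]]
W = [[1.0, 0.0], [0.0, 1.0]]          # two possible response tokens
SPANS = {"doc_A": (0, 2), "doc_B": (2, 4)}

loss, scores = response_loss_saliency(EMBS, W, target_id=0)
for name, score in span_scores(scores, SPANS).items():
    label = "support" if score > 0 else "conflict"
    print(f"{name}: {score:+.4f} ({label})")
```

With a real LLM, the same quantities would presumably come from backpropagating the loss of the generated response tokens to the context token embeddings; how the paper aggregates and thresholds the scores per response sentence is not specified in the abstract.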