TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents

Anonymous

16 Feb 2024, ACL ARR 2024 February Blind Submission
Abstract: Large language models (LLMs) have attracted great interest in many real-world applications thanks to their increasingly accurate responses and coherent reasoning abilities. At the same time, because of their complex, black-box nature, the demand for scalable and faithful explanations of LLM-generated outputs continues to grow. Explainability methods for deep learning, in particular the widely used Shapley value, have matured significantly over the past decade. However, extending Shapley values to LLMs poses major challenges, particularly with long input contexts (containing thousands of tokens) and the autoregressive generation of output sequences. In this paper, we introduce TextGenSHAP, an efficient post-hoc explanation method that incorporates LLM-specific techniques. We show that it yields significant runtime improvements over conventional Shapley value computations, reducing runtime from hours to minutes for token-level explanations and to just seconds for document-level explanations. We then demonstrate how such explanations can improve the end-to-end performance of retrieval-augmented generation by localizing important words within long documents and reranking passages collected by retrieval systems. In open-domain question answering on NQ Open and MIRACL, TextGenSHAP improves the recall of document retrieval by several points and closes the accuracy gap in open-domain question answering with a 5-10 percentage point improvement.
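For reference, the Shapley value the abstract builds on is the standard game-theoretic attribution (the classical definition, not the paper's accelerated estimator): for a feature set $N$ (e.g., tokens or retrieved documents) and a value function $v(S)$ scoring the model's output when only the features in $S$ are kept, feature $i$ receives

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr).$$

Exact computation requires on the order of $2^{|N|}$ model evaluations, which is what makes inputs with thousands of tokens prohibitive and motivates the paper's LLM-specific speedups.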
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English