Context Attribution with Multi-Armed Bandit Optimization

ACL ARR 2026 January Submission 8129 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Explainable AI, Explainable Machine Learning, Context Attribution, Context-based QA
Abstract: Understanding which parts of the retrieved context contribute to a large language model's generated answer is essential for building interpretable and trustworthy retrieval-augmented generation. We propose a novel framework that formulates context attribution as a combinatorial multi-armed bandit problem. We use Linear Thompson Sampling to efficiently identify the most influential context segments while minimizing the number of model queries. Our reward function leverages token log-probabilities to measure how well a subset of segments supports the original response, making it applicable to both open-source and black-box API-based models. Unlike SHAP and other perturbation-based methods that sample subsets uniformly, our approach adaptively prioritizes informative subsets based on posterior estimates of segment relevance, reducing computational cost. Experiments on multiple QA benchmarks demonstrate that our method achieves up to a 30\% reduction in model queries while matching or exceeding the attribution quality of existing approaches.
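The abstract's core loop can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a linear reward in a binary segment-inclusion vector, a Gaussian posterior over per-segment influence weights, and a synthetic reward standing in for the token log-probability signal the paper describes. All names and parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8          # number of context segments (arms' feature dimension)
k = 3          # segments kept per query (combinatorial action)
T = 400        # query budget
sigma2 = 0.25  # assumed reward noise variance

# Hidden "true" per-segment influence; in the paper's setting the reward
# would instead come from token log-probabilities of the original answer
# conditioned on the chosen subset of segments.
theta_true = np.zeros(d)
theta_true[:3] = [1.0, 0.8, 0.6]

# Bayesian linear regression posterior: theta ~ N(B^{-1} f, sigma2 * B^{-1})
B = np.eye(d)      # precision matrix (identity prior)
f = np.zeros(d)

for t in range(T):
    # Thompson sampling: draw a plausible weight vector from the posterior
    B_inv = np.linalg.inv(B)
    mu = B_inv @ f
    theta_sample = rng.multivariate_normal(mu, sigma2 * B_inv)

    # Combinatorial action: include the k segments with highest sampled weight
    subset = np.argsort(theta_sample)[-k:]
    x = np.zeros(d)
    x[subset] = 1.0

    # Observe a noisy linear reward for the chosen subset
    reward = x @ theta_true + rng.normal(0.0, np.sqrt(sigma2))

    # Conjugate posterior update
    B += np.outer(x, x)
    f += reward * x

# Posterior-mean influence per segment serves as the attribution score
attribution = np.linalg.inv(B) @ f
top_segments = np.argsort(attribution)[-k:]
print(sorted(top_segments.tolist()))
```

Because subsets are drawn from the posterior rather than uniformly, queries concentrate on segments that plausibly matter, which is the mechanism behind the claimed reduction in model calls relative to uniform perturbation-based sampling.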
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: feature attribution, interpretability, explanation faithfulness
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Approaches to low-compute settings: efficiency
Languages Studied: English
Submission Number: 8129