Concept-Based Local Unified Explanations

04 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: XAI, post-hoc, Concept-based, Rule-based
Abstract: There is a growing demand to combine model-agnostic explanation methods with concept-based explanations: the former can explain models across different architectures, while the latter makes explanations more faithful and understandable to end-users. However, existing concept-based model-agnostic explanation methods are limited in scope: they mainly focus on attribution-based explanations and lack support for richer explanation types such as sufficient conditions and counterfactuals, which limits their applicability. To bridge this gap, we propose a general framework, ConLUX, that elevates existing local model-agnostic techniques to provide concept-based explanations. Our key insight is that existing local model-agnostic methods can be uniformly extended to provide unified concept-based explanations by using large pre-trained models to generate perturbations. We have instantiated ConLUX to provide concept-based explanations in three forms: attributions, sufficient conditions, and counterfactuals, and applied it to popular text, image, and multimodal models. Our evaluation results demonstrate that ConLUX provides explanations that are more faithful than those of state-of-the-art concept-based explanation methods, and offers richer explanation forms that satisfy various user needs.
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 2092