Keywords: XAI, Post-hoc, Concept-based, Rule-based
Abstract: There is a growing demand to combine model-agnostic explanation methods with concept-based explanations, as the former can explain models across different architectures while the latter makes the explanations more faithful and understandable to end-users.
However, existing concept-based model-agnostic explanation methods are limited in scope: they mainly provide attribution-based explanations and lack support for richer explanation types such as sufficient conditions and counterfactuals, which restricts their applicability.
To bridge this gap, we propose ConLUX, a general framework that elevates existing local model-agnostic explanation techniques to provide concept-based explanations.
Our key insight is that existing local model-agnostic methods can be uniformly extended to produce unified concept-based explanations by using large pre-trained models to perform concept-level perturbations.
We have instantiated ConLUX to provide concept-based explanations in three forms: attributions, sufficient conditions, and counterfactuals, and applied it to popular text, image, and multimodal models.
Our evaluation results demonstrate that ConLUX provides more faithful explanations than state-of-the-art concept-based explanation methods and supports richer explanation forms that satisfy various user needs.
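To make the key insight concrete, below is a minimal, hypothetical sketch of the idea described in the abstract: a LIME-style local explainer whose token-level perturbations are replaced with concept-level perturbations. The functions `extract_concepts`, `rewrite_without_concept`, and `black_box_model` are illustrative stand-ins (not ConLUX's actual components); in the paper's setting, the first two would be realized by querying a large pre-trained model.

```python
# Hypothetical sketch: concept-level perturbation plugged into a LIME-style
# linear surrogate. All components below are toy stand-ins for illustration.
import numpy as np
from sklearn.linear_model import Ridge

def extract_concepts(text):
    # Stand-in concept extractor; a large pre-trained model would supply these.
    return ["positive sentiment", "mentions price", "mentions service"]

def rewrite_without_concept(text, concept):
    # Stand-in for a large pre-trained model rewriting the text so `concept` no longer holds.
    return text.replace("great", "") if concept == "positive sentiment" else text

def black_box_model(text):
    # Toy model under explanation: probability that a review is positive.
    return 0.9 if "great" in text else 0.3

def concept_attributions(text, n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    concepts = extract_concepts(text)
    Z, y = [], []
    for _ in range(n_samples):
        mask = rng.integers(0, 2, size=len(concepts))  # which concepts to keep
        perturbed = text
        for keep, concept in zip(mask, concepts):
            if not keep:
                perturbed = rewrite_without_concept(perturbed, concept)
        Z.append(mask)
        y.append(black_box_model(perturbed))
    # LIME-style linear fit over concept presence/absence indicators.
    surrogate = Ridge(alpha=1.0).fit(np.array(Z), np.array(y))
    return dict(zip(concepts, surrogate.coef_))

print(concept_attributions("The food was great but the price was high."))
```

The same perturbation scheme could, in principle, drive other local explainers (e.g. anchor-style sufficient conditions or counterfactual search) by swapping out the surrogate-fitting step.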
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 2092