Axiomatization of Concept CNN Explanations

ICLR 2026 Conference Submission 16422 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Concept, Explainability, CNN, Trustworthiness, Axiom
TL;DR: The paper proposes a set of axioms for evaluating prototype-guided, concept-based explanations of CNNs.
Abstract: Concept-based explanations for convolutional neural networks (CNNs) offer human-interpretable insights into the decision-making processes of artificial intelligence (AI) models. In contrast to attribution-based methods, which primarily highlight salient pixels, concept-based approaches capture higher-level semantic features, thereby elucidating not only where the model looked but also what it saw. Despite their promise, the absence of rigorous axiomatic foundations has impeded systematic evaluation, comparison, and compliance, limiting their broader adoption. This paper presents a conceptual axiomatic framework, derived from the principles of explanation logic, for evaluating the faithfulness of concept-based explanations in CNN-driven image classification. We propose a novel set of axioms that formalize essential criteria for trustworthy explanations and establish a quantitative methodology for their evaluation. Extensive experiments conducted in both ideal and adversarial settings, across diverse model architectures, demonstrate the necessity and validity of these axioms. Our findings contribute to the development of reliable, interpretable, and trustworthy explainable artificial intelligence (XAI) frameworks, with particular relevance to high-stakes domains where transparent decision-making is crucial.
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 16422