Keywords: Archetype Decomposition, Model Transparency
Abstract: Traditional concept decomposition methods have made significant progress in improving the interpretability of deep learning models, but they still face many challenges. A key issue is that they often lack traceable explanations for concepts, making it difficult to understand and verify how models make decisions and provide explanations grounded in specific concepts. To overcome this limitation, this paper proposes a new method, Conceptual Archetype Decomposition (CAD), which aims to provide a more interpretable concept learning and decision-making process. Unlike existing methods, our approach ensures that each concept is represented as a linear combination of training samples whose combination weights sum to 1. This constraint restricts the learning space of the concepts and enhances their interpretability. The advantage of our method therefore lies in its fine-grained concept activation decomposition, which directly constructs an explanatory space between training samples and concepts. Through a dual-index decision mechanism, we further analyze the relationship between test samples and training samples. Extensive experiments on the CUB and ImageNet datasets demonstrate that our model not only improves decision transparency but also exhibits stronger generalization ability in multi-class classification tasks. Our code is available at: https://anonymous.4open.science/r/CAD-4510/
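The following is a minimal, illustrative sketch (not the authors' implementation) of the constraint described in the abstract: each concept is a linear combination of training-sample features whose weights sum to 1. All names (`ArchetypeConcepts`, the feature dimensions, the use of a softmax to enforce the simplex constraint) are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArchetypeConcepts(nn.Module):
    """Hypothetical sketch: each concept is a combination of cached
    training-sample features, with mixing weights that sum to 1."""

    def __init__(self, num_concepts: int, num_train_samples: int):
        super().__init__()
        # Learnable logits; a softmax turns each row into weights summing to 1.
        self.logits = nn.Parameter(torch.zeros(num_concepts, num_train_samples))

    def forward(self, train_feats: torch.Tensor) -> torch.Tensor:
        # weights: (num_concepts, num_train_samples); each row sums to 1,
        # so every concept is traceable back to specific training samples.
        weights = F.softmax(self.logits, dim=-1)
        # concepts: (num_concepts, feat_dim), linear combinations of training samples.
        return weights @ train_feats

# Usage sketch: project test features onto the learned concepts.
train_feats = torch.randn(500, 128)            # cached training-sample features (assumed)
model = ArchetypeConcepts(num_concepts=10, num_train_samples=500)
concepts = model(train_feats)                  # (10, 128)
test_feats = torch.randn(4, 128)
activations = test_feats @ concepts.T          # per-concept activations for test samples
```

Because the mixing weights form a distribution over training samples, each concept activation can be decomposed back onto individual training samples, which is the traceability property the abstract emphasizes.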
Primary Area: interpretability and explainable AI
Submission Number: 6082