Keywords: explainability, class activation mapping, vision models, image classification
TL;DR: We propose a new gradient-free, CAM-based explanation method for image classification that works well on both CNNs and transformer-based models.
Abstract: As deep neural networks continue to achieve considerable success in high-stakes computer vision applications, the demand for transparent and interpretable decision-making is becoming increasingly critical. Post-hoc explanation methods, such as Class Activation Mapping (CAM), were developed to enhance interpretability by highlighting important regions in input images. However, existing methods often treat internal representations (feature maps or patch tokens) as independent and equally important, neglecting their semantic interactions, which can introduce irrelevant or noisy signals into the explanation. To overcome these limitations, we propose ClusCAM, a gradient-free post-hoc explanation method that groups internal representations into meaningful clusters, referred to as meta-representations. We then quantify their importance using logit differences with dropout and a temperature-scaled softmax to focus on the most influential groups. By modeling group-wise interactions, ClusCAM produces sharper and more interpretable explanations. The approach is architecture-agnostic and applicable to both Convolutional Neural Networks and Vision Transformers. In extensive experiments, ClusCAM outperforms state-of-the-art methods by up to 17.8\% in Increase in Confidence and 27.87\% in Average Gain, and produces visualizations more faithful to the model's prediction.
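To make the pipeline described in the abstract concrete, here is a minimal sketch of a ClusCAM-style procedure. It assumes a ResNet-50 backbone, k-means clustering of channel activations, and simple activation-map masking; the function name `cluscam_sketch`, the cluster count, and the masking scheme are illustrative assumptions rather than the paper's exact procedure, and the dropout step mentioned in the abstract is omitted for brevity.

```python
# Minimal sketch of a ClusCAM-style explanation (assumptions noted above; the
# paper's exact clustering, dropout, and masking details may differ).
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.cluster import KMeans

def cluscam_sketch(image, target_class, n_clusters=8, temperature=0.5):
    model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

    # 1) Capture feature maps from the last convolutional stage.
    feats = {}
    handle = model.layer4.register_forward_hook(
        lambda m, i, o: feats.update(out=o.detach())
    )
    with torch.no_grad():
        base_logit = model(image)[0, target_class]
    handle.remove()
    fmap = feats["out"][0]                                   # (C, H, W)
    C = fmap.shape[0]

    # 2) Group channels into "meta-representations" by clustering flattened maps.
    labels = torch.as_tensor(
        KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
            fmap.reshape(C, -1).numpy()
        )
    )

    # 3) Score each cluster by the logit difference when the input is masked
    #    with the cluster's upsampled, normalized mean activation map.
    scores, masks = [], []
    for k in range(n_clusters):
        m = fmap[labels == k].mean(dim=0, keepdim=True)      # (1, H, W)
        m = F.interpolate(m[None], size=image.shape[-2:],
                          mode="bilinear", align_corners=False)[0]
        m = (m - m.min()) / (m.max() - m.min() + 1e-8)
        with torch.no_grad():
            logit = model(image * m)[0, target_class]
        scores.append(logit - base_logit)                    # logit difference
        masks.append(m)

    # 4) Temperature-scaled softmax over cluster scores, then a weighted sum.
    weights = F.softmax(torch.stack(scores) / temperature, dim=0)
    saliency = torch.stack(masks).mul(weights.view(-1, 1, 1, 1)).sum(dim=0)
    return saliency.clamp(min=0)                             # (1, H, W) heat map
```

For a Vision Transformer, the same idea would apply with patch tokens in place of channel feature maps; only the extraction and upsampling steps change.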
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 18606