Keywords: explainability, class activation mapping, vision models, image classification
TL;DR: We propose a new gradient-free, CAM-based explanation method for image classification that works well on both CNNs and transformer-based models.
Abstract: As deep neural networks continue to achieve considerable success in high-stakes computer vision applications, the demand for transparent and interpretable decision-making is becoming increasingly critical. Post-hoc explanation methods, such as Class Activation Mapping (CAM), were developed to enhance interpretability by highlighting important regions in input images. However, existing methods often treat internal representations (feature maps or patch tokens) as independent and equally important, neglecting their semantic interactions, which can introduce irrelevant or noisy signals into the explanation. To overcome these limitations, we propose ClusCAM, a gradient-free post-hoc explanation method that groups internal representations into meaningful clusters, referred to as meta-representations. We then quantify their importance using logit differences with dropout and a temperature-scaled softmax to focus on the most influential groups. By modeling group-wise interactions, ClusCAM produces sharper and more interpretable explanations. The approach is architecture-agnostic and applicable to both Convolutional Neural Networks and Vision Transformers. In extensive experiments, ClusCAM outperforms state-of-the-art methods by up to 17.8\% in Increase in Confidence and 27.87\% in Average Gain, and produces visualizations more faithful to the model's prediction.
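To make the pipeline described in the abstract concrete, here is a minimal sketch of a ClusCAM-style procedure. It assumes a ResNet-50 backbone, k-means clustering of channel activations, and simple activation-map masking; the function name `cluscam_sketch`, the cluster count, and the masking scheme are illustrative assumptions rather than the paper's exact procedure, and the dropout step mentioned in the abstract is omitted for brevity.

```python
# Minimal sketch of a ClusCAM-style explanation (assumptions noted above; the
# paper's exact clustering, dropout, and masking details may differ).
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.cluster import KMeans

def cluscam_sketch(image, target_class, n_clusters=8, temperature=0.5):
    model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

    # 1) Capture feature maps from the last convolutional stage.
    feats = {}
    handle = model.layer4.register_forward_hook(
        lambda m, i, o: feats.update(out=o.detach())
    )
    with torch.no_grad():
        base_logit = model(image)[0, target_class]
    handle.remove()
    fmap = feats["out"][0]                                   # (C, H, W)
    C = fmap.shape[0]

    # 2) Group channels into "meta-representations" by clustering flattened maps.
    labels = torch.as_tensor(
        KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
            fmap.reshape(C, -1).numpy()
        )
    )

    # 3) Score each cluster by the logit difference when the input is masked
    #    with the cluster's upsampled, normalized mean activation map.
    scores, masks = [], []
    for k in range(n_clusters):
        m = fmap[labels == k].mean(dim=0, keepdim=True)      # (1, H, W)
        m = F.interpolate(m[None], size=image.shape[-2:],
                          mode="bilinear", align_corners=False)[0]
        m = (m - m.min()) / (m.max() - m.min() + 1e-8)
        with torch.no_grad():
            logit = model(image * m)[0, target_class]
        scores.append(logit - base_logit)                    # logit difference
        masks.append(m)

    # 4) Temperature-scaled softmax over cluster scores, then a weighted sum.
    weights = F.softmax(torch.stack(scores) / temperature, dim=0)
    saliency = torch.stack(masks).mul(weights.view(-1, 1, 1, 1)).sum(dim=0)
    return saliency.clamp(min=0)                             # (1, H, W) heat map
```

For a Vision Transformer, the same idea would apply with patch tokens in place of channel feature maps; only the extraction and upsampling steps change.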
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 18606