Visual Explanations for Capsule Networks

TMLR Paper7642 Authors

23 Feb 2026 (modified: 04 Mar 2026) · Under review for TMLR · CC BY 4.0
Abstract: The limited availability of explainability methods for Capsule Networks (CapsNets) restricts their adoption in critical domains such as clinical practice or legal document analysis. Although CapsNets offer structured and interpretable representations, existing explanation methods have primarily focused on more traditional Convolutional Neural Networks (CNNs) and are not directly applicable to capsule-based architectures. To address this issue, we propose a general method, Caps-CAM, which generates attribution maps that justify the predictions made by feed-forward CapsNet architectures. Unlike prior explanation methods for CapsNets that adapt techniques originally designed for CNNs, Caps-CAM explicitly employs gradient information that reflects the relevance of each capsule to a class of interest. Since the gradient highlights the most relevant capsules, each selected capsule activation map is weighted by its corresponding gradient, and the final attribution heatmap is generated as a linear combination of the weighted activation maps according to their contribution to the target class. Empirical comparisons with state-of-the-art explanation techniques previously introduced for CNNs, carried out on both standard and real-application data sets, show that Caps-CAM effectively serves as an explanation method for CapsNets.
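To make the mechanism described in the abstract concrete, the sketch below illustrates a gradient-weighted combination of capsule activation maps in the style the abstract describes. It is a minimal illustration based only on the abstract, not the paper's actual implementation: the function name caps_cam, the tensor layout, and the use of global average pooling for the gradient weights are all assumptions.

```python
# Minimal sketch of a Grad-CAM-style attribution over capsule activation maps,
# following the abstract's description. Names and pooling choice are assumptions,
# not the paper's actual code.
import torch
import torch.nn.functional as F


def caps_cam(capsule_maps: torch.Tensor, class_score: torch.Tensor) -> torch.Tensor:
    """capsule_maps: (num_capsules, H, W) activation maps requiring grad.
    class_score: scalar score (e.g., logit or capsule length) for the target class."""
    # Gradient of the target-class score w.r.t. each capsule activation map.
    grads, = torch.autograd.grad(class_score, capsule_maps, retain_graph=True)
    # Pool gradients spatially to get one relevance weight per capsule
    # (assumption: global average pooling, as in Grad-CAM).
    weights = grads.mean(dim=(1, 2), keepdim=True)           # (num_capsules, 1, 1)
    # Linear combination of the weighted activation maps, keeping positive evidence.
    heatmap = F.relu((weights * capsule_maps).sum(dim=0))    # (H, W)
    # Normalize to [0, 1] for visualization.
    return heatmap / (heatmap.max() + 1e-8)


# Toy usage with random "capsule maps" and a synthetic class score.
maps = torch.randn(8, 14, 14, requires_grad=True)
score = (maps ** 2).sum()   # stand-in for a class score from a CapsNet
cam = caps_cam(maps, score)
```

The resulting heatmap would typically be upsampled to the input resolution and overlaid on the image, as is common for CAM-style explanations.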
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Georgios_Leontidis1
Submission Number: 7642