SA-CAM: Semantic-aware visual explanations for deep convolutional networks

Published: 01 Jan 2025, Last Modified: 31 Jul 2025, Mach. Learn. 2025, CC BY-SA 4.0
Abstract: Explaining deep convolutional networks is a long-standing problem. Existing Class Activation Mapping (CAM) based methods often produce saliency maps that lack authentic information or involve time-consuming generation processes. To address these challenges, we propose Semantic-Aware Class Activation Mapping (SA-CAM), an effective and efficient post-hoc visual explanation method that considers the semantic correlation of activation maps during saliency map generation. Additionally, to reduce the computational cost of multiplying each activation map with the original feature map, we partition the activation maps into semantically related clusters to preserve the input sub-pixels, and then sum the sub-activations within each cluster as an initial mask for computing the weights. Extensive experiments on the STL-10, ImageNet-1k, ImageNet Segmentation, and MS COCO2017 datasets demonstrate that SA-CAM outperforms current state-of-the-art explanation approaches. Furthermore, we experimentally highlight the potential of SA-CAM as an effective data augmentation strategy for fine-tuning, as it requires only dozens of queries to generate accurate class-specific saliency maps.
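The clustering idea in the abstract can be illustrated with a minimal sketch. This is an assumed interpretation, not the authors' implementation: activation channels are grouped by spatial similarity (here with a tiny k-means stand-in for any clustering method), each cluster's channels are summed into one mask, the masked input is scored once per cluster, and the masks are combined with those scores as weights. The function `sa_cam_sketch`, the `score_fn` callback, and the single-channel input are all hypothetical simplifications; a real pipeline would upsample masks to the input resolution and query a trained classifier.

```python
import numpy as np

def sa_cam_sketch(activations, score_fn, input_img, n_clusters=4, seed=0):
    """Hedged sketch of a semantic-aware CAM-style explanation (assumed reading
    of SA-CAM): cluster activation channels, sum each cluster into one mask,
    score the masked input, and weight-sum the masks into a saliency map.

    activations: (C, H, W) array of feature-map channels
    score_fn:    callable mapping a masked (H, W) input to a class score
    input_img:   (H, W) array (same spatial size as activations for simplicity)
    """
    C, H, W = activations.shape
    flat = activations.reshape(C, -1)
    # Normalize channels so clustering reflects spatial pattern, not magnitude.
    flat_n = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)

    # Tiny k-means (Lloyd's algorithm) as a stand-in clustering step.
    rng = np.random.default_rng(seed)
    centers = flat_n[rng.choice(C, size=n_clusters, replace=False)]
    for _ in range(10):
        dists = ((flat_n[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = flat_n[labels == k].mean(axis=0)

    saliency = np.zeros((H, W))
    for k in range(n_clusters):
        if not (labels == k).any():
            continue
        # Sum the cluster's sub-activations into one initial mask.
        mask = activations[labels == k].sum(axis=0)
        mask = (mask - mask.min()) / (mask.max() - mask.min() + 1e-8)
        # One model query per cluster, rather than one per channel.
        weight = score_fn(input_img * mask)
        saliency += weight * mask
    return np.maximum(saliency, 0)  # keep only positively contributing regions
```

Because the model is queried once per cluster instead of once per channel, the number of forward passes drops from C (hundreds, as in Score-CAM-style methods) to `n_clusters`, which is consistent with the abstract's claim of needing only dozens of queries.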