Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification

TMLR Paper 6632 Authors

24 Nov 2025 (modified: 09 Dec 2025) · Under review for TMLR · CC BY 4.0
Abstract: Convolutional Neural Networks (CNNs) have shown remarkable performance in image classification. However, interpreting their predictions is challenging due to the size and complexity of these models. State-of-the-art saliency methods generate local explanations that highlight the area of the input image where a class is identified, but they cannot explain how a concept of interest contributes to the prediction. Conversely, concept-based methods, such as TCAV, provide insights into how sensitive the network is to a human-defined concept, but they can neither compute its attribution for a specific prediction nor show its location within the input image. We introduce Visual-TCAV, a novel explainability framework that bridges the gap between these methods by providing both local and global explanations. Visual-TCAV uses Concept Activation Vectors (CAVs) to generate class-agnostic saliency maps that show where the network recognizes a given concept. Moreover, it can estimate the attribution of these concepts to the output of any class using a generalization of Integrated Gradients. We evaluate the method's faithfulness via a controlled experiment in which the ground truth for explanations is known, showing better ground-truth alignment than TCAV. Our code is available in the supplementary material (.zip file).
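To make the two building blocks named in the abstract concrete, the sketch below shows (i) how a CAV can be obtained as the weight vector of a linear separator between concept and random activations, (ii) how a class-agnostic concept saliency map can be read off as the similarity between spatial activations and the CAV, and (iii) plain Integrated Gradients, the method the paper generalizes for concept attribution. This is an illustrative NumPy sketch under assumed shapes and function names, not the authors' implementation.

```python
import numpy as np

def compute_cav(concept_acts, random_acts, epochs=200, lr=0.1):
    """Fit a logistic-regression separator between concept and random
    activations (each of shape (n, d)); the normalized weight vector is
    the Concept Activation Vector (CAV). Illustrative gradient descent."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)),
                        np.zeros(len(random_acts))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
        g = p - y                               # logistic-loss gradient
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w / np.linalg.norm(w)

def concept_saliency_map(feature_maps, cav):
    """Class-agnostic concept map: dot similarity between each spatial
    activation vector of an (H, W, d) feature-map tensor and the CAV;
    negative evidence is clipped to zero."""
    return np.maximum(feature_maps @ cav, 0.0)  # (H, W)

def integrated_gradients(f, x, baseline=None, steps=50, eps=1e-5):
    """Plain Integrated Gradients for a scalar function f, with numerical
    gradients; the building block the paper's attribution generalizes."""
    if baseline is None:
        baseline = np.zeros_like(x)
    total = np.zeros_like(x)
    for k in range(1, steps + 1):
        xk = baseline + (k / steps) * (x - baseline)
        grad = np.zeros_like(x)
        for i in range(x.size):           # central finite differences
            d = np.zeros_like(x)
            d.flat[i] = eps
            grad.flat[i] = (f(xk + d) - f(xk - d)) / (2 * eps)
        total += grad
    return (x - baseline) * total / steps  # completeness-preserving scaling
```

For a linear model `f(x) = w · x`, `integrated_gradients` recovers exactly `w * (x - baseline)`, which is the sanity check usually used for IG implementations.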
Submission Type: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=A51FgXXT72
Changes Since Last Submission: The previous version was desk-rejected due to the absence of a quantitative comparison with saliency methods and concept bottleneck models (CBMs). In this revised submission, we added (in Appendix F.2) a quantitative comparison with Grad-CAM and Integrated Gradients, which are among the most popular and widely used saliency methods. We did not add comparisons with CBMs, as these models address a different problem (ante-hoc rather than post-hoc interpretability, which is the focus of this work).
Assigned Action Editor: ~Fernando_Perez-Cruz1
Submission Number: 6632