Keywords: explainability, concept activation vectors, image classification
TL;DR: We introduce Visual-TCAV, a local and global concept-based explainability method for image classification that produces saliency maps to show where concepts are detected and estimates their importance.
Abstract: Convolutional Neural Networks (CNNs) have seen significant performance improvements in recent years. However, due to their size and complexity, their decision-making process remains a black box, raising transparency and trust issues. State-of-the-art saliency methods can generate local explanations that highlight the area in the input image where a class is identified, but they do not explain how different features contribute to the prediction. On the other hand, concept-based methods, such as TCAV (Testing with Concept Activation Vectors), provide global explainability, but they can neither compute the attribution of a concept in a specific prediction nor show the locations where the network detects these concepts. This paper introduces a novel explainability framework, Visual-TCAV, which aims to bridge the gap between these methods. Visual-TCAV uses Concept Activation Vectors (CAVs) to generate saliency maps that show where concepts are recognized by the network. Moreover, it can estimate the attribution of these concepts to the output of any class using a generalization of Integrated Gradients. Visual-TCAV can provide both local and global explanations for any CNN-based image classification model without requiring any modifications. The framework is evaluated on widely used CNNs, and its validity is further confirmed through experiments where a ground truth for explanations is known.
Supplementary Material: zip
Primary Area: Interpretability and explainability
Submission Number: 6553
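The abstract describes learning Concept Activation Vectors (CAVs) from a linear separation of layer activations and projecting feature maps onto them to localize concepts. The following is a minimal, hypothetical sketch of that general idea; it is not the authors' implementation. The synthetic activations, array shapes, scikit-learn classifier, and positive-part clipping are all assumptions made purely for illustration.

```python
# Illustrative sketch (not the authors' code): learning a Concept Activation Vector (CAV)
# from intermediate-layer activations and projecting feature maps onto it to obtain a
# concept saliency map. All data below is synthetic and the shapes are assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Assume flattened activations from a chosen CNN layer (e.g. global-average-pooled
# feature maps) for images containing the concept vs. random counterexample images.
n_examples, n_channels = 100, 64
concept_acts = rng.normal(loc=1.0, size=(n_examples, n_channels))  # concept examples
random_acts = rng.normal(loc=0.0, size=(n_examples, n_channels))   # random counterexamples

# As in TCAV, the CAV is the normal to a linear boundary separating the two activation sets.
X = np.vstack([concept_acts, random_acts])
y = np.concatenate([np.ones(n_examples), np.zeros(n_examples)])
clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])

# A spatial concept saliency map can then be sketched by projecting the channel vector at
# each spatial position of the layer's feature maps onto the CAV (positive evidence only).
h, w = 7, 7                                            # assumed spatial size of the layer
feature_maps = rng.normal(size=(h, w, n_channels))     # activations for one test image
saliency = np.maximum(feature_maps @ cav, 0.0)         # (h, w) concept saliency map
print("concept saliency map shape:", saliency.shape)
```

In the full method described by the abstract, such concept-localized evidence would additionally be combined with a generalization of Integrated Gradients to attribute the concept's contribution to a specific class output; that step requires access to the trained model's gradients and is omitted from this sketch.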