Saliency Maps Give a False Sense of Explainability to Image Classifiers: An Empirical Evaluation across Methods and Metrics

Published: 05 Sept 2024, Last Modified: 12 Nov 2024, ACML 2024 Conference Track, CC BY 4.0
Keywords: saliency maps; interpretability; image classification
Verify Author List: I have double-checked the author list and understand that additions and removals will not be allowed after the submission deadline.
Abstract: The interpretability of deep neural networks (DNNs) has emerged as a crucial area of research, particularly in image classification tasks, where decisions often lack transparency. Saliency maps have been widely used as a tool to decode the inner workings of these networks by highlighting the regions of an input image deemed most influential in the classification decision. However, recent studies have revealed significant limitations and inconsistencies in the utility of saliency maps as explanations. This paper systematically assesses the shortcomings of saliency maps and explores alternative approaches toward more reliable and interpretable explanations for image classification models. We carry out a series of experiments to show that 1) existing evaluation protocols provide neither a fair nor a meaningful comparison of existing saliency maps, since these evaluations rely on implicit assumptions and are not differentiable; and 2) saliency maps do not provide enough information to explain the accuracy of the network, the relationships between classes, or the effect of modifications to the input images.
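To make the objects of study concrete, the following is a minimal, self-contained sketch (not the paper's implementation) of a vanilla-gradient saliency map together with a toy deletion-style evaluation metric of the kind the abstract critiques. The PyTorch/torchvision model choice and the helper names vanilla_gradient_saliency and deletion_score are illustrative assumptions, not names taken from the paper.

import torch
import torch.nn.functional as F
import torchvision.models as models

# Pretrained classifier used purely for illustration (any image classifier works).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def vanilla_gradient_saliency(model, image, target_class=None):
    """Return an (H, W) saliency map: |d class score / d pixel|, max over channels.
    `image` is a normalized (1, 3, H, W) tensor."""
    image = image.detach().clone().requires_grad_(True)
    logits = model(image)
    if target_class is None:
        target_class = logits.argmax(dim=1).item()
    logits[0, target_class].backward()
    return image.grad.detach().abs().max(dim=1).values[0]

def deletion_score(model, image, saliency, target_class, steps=20):
    """Toy deletion metric: zero out the most salient pixels first and average
    the class probability along the way (lower = the map 'explains' more)."""
    order = saliency.flatten().argsort(descending=True)  # most salient pixels first
    chunk = order.numel() // steps
    masked = image.detach().clone()
    probs = []
    for i in range(steps + 1):
        with torch.no_grad():
            probs.append(F.softmax(model(masked), dim=1)[0, target_class].item())
        idx = order[i * chunk:(i + 1) * chunk]
        masked.view(1, 3, -1)[..., idx] = 0.0  # delete the next chunk (all channels)
    return sum(probs) / len(probs)

Note that deletion_score already illustrates two of the issues raised above: it implicitly assumes that zeroed-out pixels are a "neutral" baseline, and its rank-and-mask procedure is not differentiable.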
A Signed Permission To Publish Form In Pdf: pdf
Supplementary Material: pdf
Primary Area: Trustworthy Machine Learning (accountability, explainability, transparency, causality, fairness, privacy, robustness, autoML, etc.)
Paper Checklist Guidelines: I certify that all co-authors of this work have read and commit to adhering to the guidelines in Call for Papers.
Student Author: No
Submission Number: 135