New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Published: 31 Oct 2022, Last Modified: 13 Jan 2023, NeurIPS 2022 Accept, Readers: Everyone
Keywords: saliency, saliency methods, saliency evaluation, soundness, sanity checks, interpretability
Abstract: Saliency methods compute heat maps that highlight the portions of an input that were most important for the label assigned to it by a deep net. Evaluations of saliency methods convert this heat map into a new masked input by retaining the $k$ highest-ranked pixels of the original input and replacing the rest with "uninformative" pixels, and then check whether the net's output is mostly unchanged. The heat map is then usually seen as an explanation of the output, but the current paper highlights reasons why this inference of causality may be suspect. Inspired by the logic concepts of completeness and soundness, it observes that the above type of evaluation focuses on completeness of the explanation but ignores soundness. New evaluation metrics are introduced to capture both notions while staying in an intrinsic framework, i.e., using the dataset and the net, but no separately trained nets, human evaluations, etc. A simple saliency method is described that matches or outperforms prior methods in the evaluations. Experiments also suggest new intrinsic justifications, based on soundness, for popular heuristic tricks such as TV regularization and upsampling.
TL;DR: Inspired by the idea of soundness from logic systems, this paper provides a new dimension for intrinsic evaluations of saliency methods.
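The masking-based evaluation described in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the function names `mask_top_k` and `output_retained`, the single-channel image layout, the constant `fill_value` standing in for "uninformative" pixels, and the `model` callable returning class probabilities are all assumptions made for the example. It covers only the retain-top-$k$ check, which the abstract associates with completeness; the paper's soundness metric is not specified here and is not reproduced.

```python
import numpy as np


def mask_top_k(image, saliency, k, fill_value=0.0):
    """Keep the k highest-saliency pixels of `image`; replace the rest.

    Assumes `image` and `saliency` are arrays of the same shape, and that
    a constant `fill_value` is an acceptable "uninformative" pixel
    (an assumption of this sketch).
    """
    if image.shape != saliency.shape:
        raise ValueError("image and saliency must have the same shape")
    flat = saliency.ravel()
    if not 1 <= k <= flat.size:
        raise ValueError("k must be between 1 and the number of pixels")
    # Indices of the k largest saliency scores (their relative order is irrelevant).
    top_idx = np.argpartition(flat, -k)[-k:]
    keep = np.zeros(flat.shape, dtype=bool)
    keep[top_idx] = True
    keep = keep.reshape(saliency.shape)
    # Retained pixels come from the original image, the rest from fill_value.
    return np.where(keep, image, fill_value)


def output_retained(model, image, saliency, k, label, fill_value=0.0):
    """Return the probability the model assigns to `label` on the masked input.

    `model` is a hypothetical callable mapping an image array to a 1-D array
    of class probabilities.
    """
    masked = mask_top_k(image, saliency, k, fill_value)
    probs = model(masked)
    return float(probs[label])
```

Comparing `output_retained(...)` with the model's probability on the unmasked image gives one way to read "mostly unchanged"; the threshold for that comparison is left to the reader.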
Supplementary Material: pdf