Beyond Faithfulness: A Framework to Characterize and Compare Saliency Methods

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 | ICLR 2022 Submitted | Readers: Everyone
Keywords: saliency methods, interpretability, faithfulness, explainability, attribution, feature importance
Abstract: Saliency methods calculate how important each input feature is to a machine learning model’s prediction, and are commonly used to understand model reasoning. “Faithfulness,” or how fully and accurately the saliency output reflects the underlying model, is an oft-cited desideratum for these methods. However, explanation methods must necessarily sacrifice certain information in service of user-oriented goals such as simplicity. To that end, and akin to performance metrics, we frame saliency methods as abstractions: individual tools that provide insight into specific aspects of model behavior and entail tradeoffs. Using this framing, we describe a framework of nine dimensions to characterize and compare the properties of saliency methods. We group these dimensions into three categories that map to different phases of the interpretation process: methodology, or how the saliency is calculated; sensitivity, or relationships between the saliency result and the underlying model or input; and perceptibility, or how a user interprets the result. As we show, these dimensions give us a granular vocabulary for describing and comparing saliency methods — for instance, allowing us to develop “saliency cards” as a form of documentation, or helping downstream users understand tradeoffs and choose a method for a particular use case. Moreover, by situating existing saliency methods within this framework, we identify opportunities for future work, including filling gaps in the landscape and developing new evaluation metrics.
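
For concreteness, below is a minimal sketch of one widely used saliency method, vanilla gradients, of the kind the framework's dimensions are meant to characterize. It assumes a PyTorch classifier that maps a batched input to class logits; the function name and setup are illustrative, not taken from the paper.

```python
# Illustrative sketch (not from the paper): "vanilla gradient" saliency,
# one example of the family of methods the framework characterizes.
import torch

def gradient_saliency(model, x, target_class):
    """Return |d(logit_target)/d(input)| as a per-feature importance map.

    Assumes `model(x)` returns logits of shape [1, num_classes] for a
    batched input `x` of shape [1, ...].
    """
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)                   # forward pass
    logits[0, target_class].backward()  # gradient of the target logit w.r.t. input
    return x.grad.abs().squeeze(0)      # feature-wise importance magnitudes
```

Even this simple method illustrates the tradeoffs the abstract describes: it is cheap and deterministic, but reflects only the model's local behavior around the input.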
One-sentence Summary: A framework to describe, compare, and select saliency methods, with nine dimensions corresponding to meaningful attributes about what the method represents and how it is computed.
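
To illustrate the “saliency card” idea, here is a hypothetical sketch of such a record, grouped by the three categories named in the abstract (methodology, sensitivity, perceptibility). The field names inside each category are invented placeholders, not the paper's actual nine dimensions.

```python
# Hypothetical "saliency card" record; only the three category names come
# from the abstract. The per-category fields are illustrative placeholders.
from dataclasses import dataclass, field

@dataclass
class SaliencyCard:
    method: str
    # Methodology: how the saliency is calculated.
    methodology: dict = field(default_factory=dict)
    # Sensitivity: relationships between the result and the model or input.
    sensitivity: dict = field(default_factory=dict)
    # Perceptibility: how a user interprets the result.
    perceptibility: dict = field(default_factory=dict)

card = SaliencyCard(
    method="Vanilla Gradients",
    methodology={"determinism": "deterministic", "hyperparameters": "none"},
    sensitivity={"scope": "local gradient only"},
    perceptibility={"output": "per-feature magnitude map"},
)
```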