Explanatory Masks for Neural Network Interpretability

Lawrence Phillips; Nathan Hodas; Garrett B. Goh

12 Feb 2018 (modified: 05 May 2023), ICLR 2018 Workshop Submission, Readers: Everyone

Abstract: Neural network interpretability is a vital component for applications across a wide variety of domains. One way to explain neural networks is to indicate which input data is responsible for the decision via a data mask. In this work, we present a method to produce explanation masks for pre-trained networks. Our masks identify which parts of the input are most important for accurate prediction. Masks are created by a secondary network whose goal is to create as small an explanation as possible while still preserving predictive accuracy. We demonstrate the applicability of our method for image classification with CNNs, sentiment analysis with RNNs, and chemical property prediction with mixed CNN/RNN architectures.

Keywords: Deep Learning, Model Explanation, Interpretability

TL;DR: Network-generated input masks provide explanations for how pre-trained networks make their decisions.
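
The abstract describes a trade-off: a mask should be as small as possible while the pre-trained network's prediction on the masked input stays accurate. The listing does not give the loss used in the paper, so the following is only a minimal sketch of one way such a trade-off could be scored. The names `mask_objective`, `frozen_predict`, and `sparsity_weight`, the toy logistic model, and the choice of cross-entropy plus mean mask value are all assumptions made for illustration. In the method itself a secondary network produces the mask; here the candidate masks are written by hand.

```python
import math

# Hypothetical stand-in for a frozen, pre-trained binary classifier.
# Only the first feature carries signal (weights are assumed for the demo).
WEIGHTS = [3.0, 0.0, 0.0]

def frozen_predict(x):
    """Return P(class = 1) from a fixed logistic model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))

def mask_objective(x, mask, target, sparsity_weight=0.1):
    """Score a mask: prediction loss on the masked input plus a size penalty.

    Assumed form: cross-entropy + sparsity_weight * mean(mask).
    Lower is better. This is an illustrative objective, not the paper's.
    """
    masked = [xi * mi for xi, mi in zip(x, mask)]
    p = frozen_predict(masked)
    eps = 1e-12  # guards against log(0)
    cross_entropy = -(target * math.log(p + eps)
                      + (1 - target) * math.log(1 - p + eps))
    mask_size = sum(mask) / len(mask)
    return cross_entropy + sparsity_weight * mask_size

x = [2.0, 5.0, -1.0]
target = 1

full_mask = mask_objective(x, [1, 1, 1], target)    # keeps everything
sparse_mask = mask_objective(x, [1, 0, 0], target)  # keeps the informative feature
bad_mask = mask_objective(x, [0, 1, 1], target)     # drops the informative feature
```

Under these assumptions the mask that keeps only the informative feature scores lowest: it preserves the prediction as well as the full mask does while paying a smaller size penalty, whereas the mask that hides the informative feature is penalized by the prediction term.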
