Explanatory Masks for Neural Network Interpretability

Lawrence Phillips, Nathan Hodas, Garrett B. Goh

Feb 12, 2018 (modified: Feb 12, 2018) ICLR 2018 Workshop Submission
  • Abstract: Neural network interpretability is a vital component for applications across a wide variety of domains. One way to explain a neural network's decision is to indicate, via a data mask, which parts of the input are responsible for it. In this work, we present a method to produce explanation masks for pre-trained networks. Our masks identify which parts of the input are most important for accurate prediction. Masks are created by a secondary network whose goal is to create as small an explanation as possible while still preserving predictive accuracy. We demonstrate the applicability of our method to image classification with CNNs, sentiment analysis with RNNs, and chemical property prediction with mixed CNN/RNN architectures.
  • TL;DR: Network-generated input masks provide explanations for how pre-trained networks make their decisions.
  • Keywords: Deep Learning, Model Explanation, Interpretability
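
The trade-off stated in the abstract, a mask that is as small as possible while still preserving predictive accuracy, can be illustrated with a toy objective. The sketch below is not the authors' implementation: the frozen logistic-regression "pre-trained" model, the multiplicative mask, the mean-of-mask size penalty, and the names `mask_objective` and `sparsity_weight` are all assumptions made for illustration. The secondary network that would generate the mask is omitted; masks are supplied by hand so that only the objective is shown.

```python
import math

# Hypothetical frozen "pre-trained" model: logistic regression with fixed
# weights. Only the first input feature influences the prediction.
WEIGHTS = [4.0, 0.0, 0.0, 0.0]
BIAS = -2.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x):
    """Probability of the positive class under the frozen model."""
    return sigmoid(sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS)


def apply_mask(x, mask):
    """Element-wise multiplicative mask (an assumption of this sketch)."""
    return [m * xi for m, xi in zip(mask, x)]


def mask_objective(x, label, mask, sparsity_weight=0.5):
    """Prediction loss on the masked input plus a penalty on mask size."""
    p = predict(apply_mask(x, mask))
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    prediction_loss = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    mask_size = sum(mask) / len(mask)  # fraction of the input kept
    return prediction_loss + sparsity_weight * mask_size


x = [1.0, 0.7, -0.3, 0.9]
label = 1

full_mask = [1.0, 1.0, 1.0, 1.0]   # keeps everything
small_mask = [1.0, 0.0, 0.0, 0.0]  # keeps only the informative feature
empty_mask = [0.0, 0.0, 0.0, 0.0]  # keeps nothing

for name, mask in [("full", full_mask), ("small", small_mask), ("empty", empty_mask)]:
    print(name, round(mask_objective(x, label, mask), 4))
```

In this toy setting the small mask scores best: it leaves the model's prediction unchanged while paying a lower size penalty than the full mask, whereas the empty mask pays no size penalty but destroys the prediction. A mask-generating network trained against such an objective would be pushed toward the same kind of compact explanation.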