NormLime: A New Feature Importance Metric for Explaining Deep Neural Networks

Anonymous

Sep 25, 2019 ICLR 2020 Conference Blind Submission readers: everyone Show Bibtex
  • Keywords: Machine Learning, Deep Learning, Interpretability, Feature Importance, Salience
  • TL;DR: We introduce a new salience map (feature importance function) to generate global interpretations, and evaluate the method both quantitatively using a standard ablation technique, as well as qualitatively through a human user study.
  • Abstract: The problem of explaining deep learning models, and model predictions generally, has attracted intensive interest recently. Many successful approaches forgo global approximations in order to provide more faithful local interpretations of the model’s behavior. LIME develops multiple interpretable models, each approximating a large neural network on a small region of the data manifold, and SP-LIME aggregates the local models to form a global interpretation. Extending this line of research, we propose a simple yet effective method, NormLIME, for aggregating local models into global and class-specific interpretations. A human user study strongly favored the class-specific interpretations created by NormLIME to other feature importance metrics. Numerical experiments employing Keep And Retrain (KAR) based feature ablation across various baselines (Random, Gradient-based, LIME, SHAP) confirms NormLIME’s effectiveness for recognizing important features.
0 Replies

Loading