Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information

Yang Zhang; Ashkan Khakzar; Yawei Li; Azade Farshad; Seong Tae Kim; Nassir Navab

Published: 09 Nov 2021 | Last Modified: 08 Sept 2024 | NeurIPS 2021 Poster | Readers: Everyone

Keywords: Feature attribution, Interpretability, Explainable AI

Abstract: One principal approach for illuminating a black-box neural network is feature attribution, i.e., identifying the importance of input features for the network's prediction. The predictive information of features has recently been proposed as a proxy for their importance. So far, predictive information has been identified only for latent features, by placing an information bottleneck within the network. We propose a method to identify features with predictive information in the input domain. The method yields a fine-grained identification of the information carried by input features and is agnostic to network architecture. The core idea of our method is to place a bottleneck on the input that lets through only those input features associated with predictive latent features. We compare our method with several feature attribution methods using mainstream feature attribution evaluation experiments. The code is publicly available.

TL;DR: We propose a feature attribution method via identifying input features with predictive information.

Supplementary Material: zip

Code: https://github.com/CAMP-eXplain-AI/InputIBA
