On Sparse Critical Paths of Neural Response

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Withdrawn Submission · Readers: Everyone
Keywords: Interpretability, Feature Attribution, Neural Network Pruning, Critical Paths
Abstract: Is critical input information encoded in specific sparse paths within the network? The pruning objective --- finding a subset of neurons for which the response remains unchanged --- has been used to discover such paths. However, we show that paths obtained from this objective do not necessarily encode the input features and can also include (dead) neurons that did not contribute to the original response. We instead investigate selecting paths based on neurons' contributions to the response, so that the paths envelop the critical segments of the encoded input information. We show that these paths are provably locally linear within an $\ell_{2}$-ball around the input, and thus have stable gradients. We leverage this property to propose a feature attribution paradigm that is guided by neurons and therefore inherently accounts for interactions between input features. We evaluate the attribution methodology quantitatively on mainstream benchmarks.
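To make the abstract's distinction concrete, the sketch below contrasts contribution-based path selection with pruning on a toy one-hidden-layer ReLU network. The contribution score used here (a neuron's activation times its outgoing weight) is an illustrative stand-in, not the paper's exact definition, and all names are hypothetical. It also illustrates the local-linearity claim: with a fixed activation pattern, the masked sub-network is exactly linear in the input, so its gradient is constant in any $\ell_2$-ball that preserves that pattern.

```python
import numpy as np

# Toy setup (illustrative, not the paper's architecture):
# a one-hidden-layer ReLU network y = w2 @ relu(W1 @ x).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))
w2 = rng.normal(size=8)

def response(x, mask):
    """Response of the sub-network restricted to the neurons in `mask`."""
    a = np.maximum(W1 @ x, 0.0) * mask  # ReLU activations, path-restricted
    return w2 @ a

x = rng.normal(size=4)
a = np.maximum(W1 @ x, 0.0)

# Illustrative contribution score: activation times outgoing weight.
# Dead neurons (a_j = 0) have zero contribution and are never selected,
# unlike with a pure pruning objective, which may retain them.
contrib = a * w2

# Keep the k neurons with the largest absolute contribution as the "path".
k = 4
path = np.argsort(-np.abs(contrib))[:k]
mask = np.zeros(8)
mask[path] = 1.0

# With the activation pattern fixed, relu(z) = z * (z > 0), so the masked
# network is linear in x and its input gradient is the constant vector:
grad = (w2 * mask * (W1 @ x > 0)) @ W1
```

Because the masked sub-network is linear (and homogeneous) under the fixed activation pattern, `response(x, mask)` equals `grad @ x` exactly, which is the stable-gradient property the abstract exploits for attribution.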
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We investigate selecting paths that encode the critical input features and show how to leverage them for interpreting the neural response.
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=3rzSVUW0gX