Keywords: Deep Learning, Feature Importance, Information Theory
TL;DR: We use multivariate mutual information to investigate which groups of neurons contain synergistic or redundant information about the classification decision of a trained neural network.
Abstract: Quantifying which neurons are important for the classification decision of a trained neural network is essential for understanding its inner workings. Previous work has primarily attributed importance to individual neurons. In this work, we study which groups of neurons contain synergistic or redundant information, using a multivariate extension of the mutual information (the O-information). We observe that the first layer is dominated by redundancy, suggesting general, shared features (e.g. edge detectors), while the last layer is dominated by synergy, indicating local, class-specific features (e.g. concepts). Finally, we show that the O-information can be used to attribute importance to groups of neurons. We demonstrate this by re-training a synergistic sub-network, which results in only a minimal change in performance. These results suggest that our method can be used for pruning and unsupervised representation learning.
In-person Presentation: yes
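
As a rough illustration of the quantity named in the abstract, the O-information of n variables can be written as Ω(X) = (n − 2) H(X) + Σ_i [H(X_i) − H(X_{−i})], where X_{−i} denotes all variables except X_i. Positive values indicate redundancy-dominated groups and negative values indicate synergy-dominated groups. The sketch below evaluates this under a Gaussian assumption, which is our simplification and not necessarily the estimator used in the paper; the function names `gaussian_entropy` and `o_information` and the toy covariance matrices are hypothetical.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian with covariance `cov`."""
    cov = np.atleast_2d(cov)
    k = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (k * np.log(2 * np.pi * np.e) + logdet)


def o_information(cov):
    """O-information of jointly Gaussian variables with covariance `cov`.

    Assumption: Gaussian variables, so every entropy has a closed form.
    Positive -> redundancy-dominated, negative -> synergy-dominated.
    """
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    omega = (n - 2) * gaussian_entropy(cov)
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        # H(X_i) minus the entropy of all remaining variables
        omega += gaussian_entropy(cov[i, i]) - gaussian_entropy(cov[np.ix_(rest, rest)])
    return omega


# Toy "neurons": three noisy copies of one shared signal (redundant) ...
redundant_cov = np.ones((3, 3)) + 0.1 * np.eye(3)
# ... versus X3 = X1 + X2 + noise with independent X1, X2 (synergistic).
synergistic_cov = np.array([[1.0, 0.0, 1.0],
                            [0.0, 1.0, 1.0],
                            [1.0, 1.0, 2.1]])

print(o_information(redundant_cov))    # positive
print(o_information(synergistic_cov))  # negative
```

For activations recorded from a real network, the covariance would have to be estimated from samples (for instance with `np.cov`), and the Gaussian assumption may be a poor fit for non-linear activations.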