Step-wise Sensitivity Analysis: Identifying Partially Distributed Representations for Interpretable Deep Learning
Abstract: In this paper, we introduce a novel method, called step-wise sensitivity analysis, which makes three contributions towards increasing the interpretability of Deep Neural Networks (DNNs). First, to the best of our knowledge, we are the first to suggest a methodology that aggregates results across input stimuli to obtain model-centric, rather than instance-specific, results. Second, we linearly approximate each neuron's activation and use the outlier weights of that approximation to identify distributed code. Third, our method constructs a dependency graph of the relevant neurons across the network to give a fine-grained understanding of the nature and interactions of a DNN's internal features. The dependency graph reveals shared subgraphs that generalise across 10 classes and can be clustered into semantically related groups. This is a first step towards building decision trees as an interpretation of learned representations.
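A minimal sketch of one step of this procedure in PyTorch might look as follows. The toy network, the gradient-based linear approximation, and the mean-plus-k-standard-deviations outlier rule are illustrative assumptions, not the authors' exact implementation:

import torch
import torch.nn as nn

# Hypothetical network standing in for the trained DNN under study.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

def relevant_predecessors(layer_out, layer_in, neuron_idx, k=2.0):
    """Linearly approximate one neuron's activation via its gradient with
    respect to the previous layer, then keep only the outlier weights."""
    grads = torch.autograd.grad(layer_out[0, neuron_idx], layer_in,
                                retain_graph=True)[0].squeeze(0)
    # Assumed outlier rule: |gradient| beyond k standard deviations of the mean.
    threshold = grads.abs().mean() + k * grads.abs().std()
    return (grads.abs() > threshold).nonzero().flatten().tolist()

# Step-wise construction of the dependency graph for a single class logit.
# The paper aggregates such graphs across many input stimuli; a single
# random input is used here for brevity.
x = torch.randn(1, 784, requires_grad=True)
hidden = model[1](model[0](x))
logits = model[2](hidden)

graph = {}  # (layer, neuron) -> indices of relevant predecessor neurons
class_idx = 0
for h in relevant_predecessors(logits, hidden, class_idx):
    graph[("hidden", h)] = relevant_predecessors(hidden, x, h)
print(graph)

Repeating this over many inputs and keeping the recurring edges would yield the model-centric dependency graph described in the abstract.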
Keywords: Interpretability, Interpretable Deep Learning, XAI, dependency graph, sensitivity analysis, outlier detection, instance-specific, model-centric
TL;DR: We find dependency graphs between learned representations as a first step towards building decision trees to interpret the representation manifold.