Step-wise Sensitivity Analysis: Identifying Partially Distributed Representations for Interpretable Deep Learning

Botty Dimanov, Mateja Jamnik

Sep 27, 2018 ICLR 2019 Conference Blind Submission
  • Abstract: In this paper, we introduce a novel method, called step-wise sensitivity analysis, which makes three contributions towards increasing the interpretability of Deep Neural Networks (DNNs). First, we are the first to suggest a methodology that aggregates results across input stimuli to gain model-centric results. Second, we linearly approximate the neuron activation and propose to use the outlier weights to identify distributed code. Third, our method constructs a dependency graph of the relevant neurons across the network to gain a fine-grained understanding of the nature and interactions of a DNN's internal features. The dependency graph illustrates shared subgraphs that generalise across 10 classes and can be clustered into semantically related groups. This is the first step towards building decision trees as an interpretation of learned representations.
  • Keywords: Interpretability, Interpretable Deep Learning, XAI, dependency graph, sensitivity analysis, outlier detection, instance-specific, model-centric
  • TL;DR: We find dependency graphs between learned representations as a first step towards building decision trees to interpret the representation manifold.
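
The abstract names three ingredients (a linear approximation of each neuron's activation, outlier detection over the resulting weights, and a dependency graph aggregated across inputs) without giving implementation details. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the contribution measure `w * a`, the `mean + k * std` outlier rule, the `min_fraction` aggregation threshold, and the premise that per-layer activations were already recorded from a forward pass.

```python
from statistics import mean, pstdev


def outlier_indices(values, k=1.0):
    """Indices whose value exceeds mean + k * std (an assumed outlier rule)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if v > mu + k * sigma]


def relevant_inputs(weight_row, activations, k=1.0):
    """Linear approximation: input i contributes w_i * a_i to the neuron."""
    contributions = [w * a for w, a in zip(weight_row, activations)]
    return outlier_indices(contributions, k)


def dependency_graph(layers, layer_activations, output_neuron, k=1.0):
    """Walk backwards from one output neuron, one layer at a time.

    layers[l][j] is the weight row of neuron j in layer l + 1 over the
    neurons of layer l; layer_activations[l] holds the activations of
    layer l. Returns a set of edges ((l, i), (l + 1, j)).
    """
    edges = set()
    frontier = {output_neuron}
    for l in range(len(layers) - 1, -1, -1):
        next_frontier = set()
        for j in frontier:
            for i in relevant_inputs(layers[l][j], layer_activations[l], k):
                edges.add(((l, i), (l + 1, j)))
                next_frontier.add(i)
        frontier = next_frontier
    return edges


def aggregate(graphs, min_fraction=0.5):
    """Keep edges present in at least min_fraction of the per-input graphs."""
    counts = {}
    for graph in graphs:
        for edge in graph:
            counts[edge] = counts.get(edge, 0) + 1
    return {e for e, c in counts.items() if c >= min_fraction * len(graphs)}
```

In this sketch, `dependency_graph` would be called once per input stimulus, and `aggregate` would then merge the per-input graphs for a class into a single model-centric graph by keeping only the edges that recur often enough.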