Beyond Pixels: A Sample-Based Method for Understanding the Decisions of Neural Networks

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission · Readers: Everyone
Keywords: Machine Learning Interpretability, Bias, ImageNet, AlexNet, ResNet, VGG-16, Inception, CNNs, MNIST
Abstract: Interpretability in deep learning is one of the largest obstacles to the wider adoption of deep learning in critical applications. A variety of methods have been introduced to understand and explain the decisions made by large neural networks. One class of these methods comprises algorithms that attempt to highlight which input or feature subset was most influential to a model's predictions. We identify two key weaknesses in existing methods. First, most do not provide a formal measure of which features are important on their own and which are important due to correlations with others. Second, many are applied only to the most granular components of the input (e.g., pixels). We partially address these problems by proposing a novel Morris-screening-based sensitivity analysis method using input partitioning (MoSIP). MoSIP quantifies the local and global importance of less granular aspects of the input space and helps highlight which parts of the input are individually important and which are potentially important due to correlations. Through experiments on MNIST with spurious correlations (Biased-MNIST) and on the large-scale ImageNet-1K dataset, we reveal several new and interesting findings. Our key finding is that newer CNN architectures (e.g., ResNet) do not extract fundamentally more relevant features than older architectures (e.g., VGG); they simply make stronger use of non-linearities and feature interactions. This can manifest itself in the use of spurious correlations in the data to make decisions.
One-sentence Summary: This paper introduces a sample-based method for understanding how semantic representations of inputs (e.g., image regions and color) impact model predictions.
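For concreteness, below is a minimal sketch of the Morris elementary-effects screening that the abstract describes, applied to a grid partition of an image. The grid partition, the additive perturbation, and all names (`elementary_effects`, `model`, `grid`, `delta`) are illustrative assumptions; the abstract does not specify MoSIP's exact partitioning or perturbation scheme.

```python
import numpy as np

def elementary_effects(model, image, grid=4, delta=0.5, n_trajectories=8, rng=None):
    """Morris screening over image regions: perturb one grid cell at a
    time along a randomized trajectory and record the elementary effect
    of each perturbation on the model's scalar output."""
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    rh, rw = h // grid, w // grid
    k = grid * grid                        # number of input partitions
    effects = np.zeros((n_trajectories, k))

    for t in range(n_trajectories):
        x = image.astype(float)            # fresh trajectory start point
        prev = model(x)
        for j in rng.permutation(k):       # one-at-a-time, random order
            r, c = divmod(int(j), grid)
            step = delta * rng.choice([-1.0, 1.0])
            x[r * rh:(r + 1) * rh, c * rw:(c + 1) * rw] += step
            cur = model(x)
            effects[t, j] = (cur - prev) / step   # signed elementary effect
            prev = cur

    mu_star = np.abs(effects).mean(axis=0)   # mu*: importance of each region
    sigma = effects.std(axis=0)              # sigma: non-linearity / interactions
    return mu_star.reshape(grid, grid), sigma.reshape(grid, grid)

# Toy usage with a stand-in "model" that averages pixel intensities.
mu_star, sigma = elementary_effects(lambda x: float(x.mean()),
                                    np.random.rand(32, 32), rng=0)
```

In Morris screening, a large mean absolute effect (mu*) marks a region as important on its own, while a large standard deviation (sigma) signals non-linearity or interaction with other regions; this is the distinction the abstract draws between individually important features and features that matter through correlations.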