Overinterpretation reveals image classification model pathologies

28 Sept 2020 (modified: 22 Oct 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: computer vision, benchmarks, datasets, convolutional neural networks, interpretability, robustness, overinterpretation
Abstract: Image classifiers are typically scored on their test set accuracy, but high accuracy can mask a subtle type of model failure. We find that convolutional neural networks (CNNs) that score highly on popular benchmarks exhibit troubling pathologies that allow them to attain high accuracy even in the absence of semantically salient features. When a model provides a high-confidence decision without salient supporting input features, we say the classifier has overinterpreted its input, finding too much class evidence in patterns that appear nonsensical to humans. Here, we demonstrate that neural networks trained on CIFAR-10 and ImageNet suffer from overinterpretation, and we find that models trained on CIFAR-10 make confident predictions even when 95% of input images are masked and humans cannot discern salient features in the remaining pixel subsets. Although these patterns portend potential model fragility in real-world deployment, they are in fact valid statistical patterns of the benchmark that alone suffice to attain high test accuracy. Unlike adversarial examples, overinterpretation relies upon unmodified image pixels. We find ensembling and input dropout can each help mitigate overinterpretation.
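The masking experiment described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' code: the untrained ResNet-18 stand-in, the random batch, and the random choice of which 5% of pixels to keep are all placeholders (the paper identifies *specific* sparse pixel subsets via a selection procedure, which is what makes the preserved confidence surprising).

```python
import torch
import torch.nn.functional as F
import torchvision

# Stand-in classifier: any CIFAR-10 model with a standard forward() works here.
model = torchvision.models.resnet18(num_classes=10)
model.eval()

def mask_all_but_fraction(images, keep_frac=0.05):
    """Zero out all but a random keep_frac of pixels in each image.

    The paper finds specific sparse pixel subsets that preserve confident
    predictions; a random subset is only a crude stand-in for that procedure.
    """
    b, c, h, w = images.shape
    keep = torch.rand(b, 1, h, w, device=images.device) < keep_frac
    return images * keep

images = torch.randn(8, 3, 32, 32)  # placeholder for a CIFAR-10 batch
with torch.no_grad():
    probs_full = F.softmax(model(images), dim=1)
    probs_masked = F.softmax(model(mask_all_but_fraction(images)), dim=1)

# Overinterpretation shows up as high max-probability (confidence) on the
# masked inputs even though ~95% of each image has been removed.
print("mean confidence, full images:", probs_full.max(dim=1).values.mean().item())
print("mean confidence, 95%-masked :", probs_masked.max(dim=1).values.mean().item())
```

The input-dropout mitigation mentioned in the abstract can be read as applying a mask of this kind to training batches, so the model cannot come to rely on any fixed sparse pixel pattern; that reading is an assumption here, not a quote of the paper's training recipe.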
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We demonstrate that neural networks trained on CIFAR-10 and ImageNet overinterpret their inputs and rely upon semantically meaningless features present in unmodified input pixels.
Community Implementations: [8 code implementations](https://www.catalyzex.com/paper/arxiv:2003.08907/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=rBcPj_gTu_