Hard ImageNet: Segmentations for Objects with Strong Spurious Cues

Mazda Moayeri; Sahil Singla; Soheil Feizi

Published: 17 Sept 2022, Last Modified: 23 May 2023

NeurIPS 2022 Datasets and Benchmarks

Readers: Everyone

Keywords: dataset, classification, spurious features, segmentations

Abstract: Deep classifiers are known to rely on spurious features, leading to reduced generalization. The severity of this problem varies significantly by class. We identify $15$ classes in ImageNet with very strong spurious cues, and collect segmentation masks for these challenging objects to form \emph{Hard ImageNet}. Leveraging noise-, saliency-, and ablation-based metrics, we demonstrate that models rely on spurious features in Hard ImageNet far more than in RIVAL10, an ImageNet analog to CIFAR10. We observe that Hard ImageNet objects are less centered and occupy much less space in their images than RIVAL10 objects, leading to greater spurious feature reliance. Further, we use robust neural features to automatically rank our images based on the degree of spurious cues present. Comparing images with high and low rankings within a class reveals the exact spurious features models rely upon, and shows reduced performance when spurious features are absent. With Hard ImageNet's image rankings, object segmentations, and our extensive evaluation suite, the community can begin to address the problem of learning to detect challenging objects \emph{for the right reasons}, despite the presence of strong spurious cues.
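The abstract's observation that objects are "less centered and occupy much less space" can be made concrete with two simple statistics computed from a binary segmentation mask. The sketch below is illustrative only: the function name `mask_stats`, the nested-list mask representation, and the normalization by half the image diagonal are assumptions made here, not the authors' code or their exact definitions.

```python
import math
from typing import List, Tuple


def mask_stats(mask: List[List[int]]) -> Tuple[float, float]:
    """Return (area_fraction, center_offset) for a binary object mask.

    area_fraction: share of image pixels that belong to the object.
    center_offset: distance from the object's centroid to the image
    center, divided by half the image diagonal (0.0 means centered).
    Both definitions are assumptions for illustration.
    """
    height, width = len(mask), len(mask[0])
    # Collect (row, col) coordinates of every object pixel.
    coords = [(r, c) for r in range(height) for c in range(width) if mask[r][c]]
    if not coords:
        raise ValueError("mask contains no object pixels")

    area_fraction = len(coords) / (height * width)

    centroid_r = sum(r for r, _ in coords) / len(coords)
    centroid_c = sum(c for _, c in coords) / len(coords)
    center_r, center_c = (height - 1) / 2, (width - 1) / 2

    half_diag = math.hypot(center_r, center_c)
    offset = math.hypot(centroid_r - center_r, centroid_c - center_c)
    # A 1x1 image has no diagonal; treat its only pixel as centered.
    return area_fraction, (offset / half_diag if half_diag else 0.0)


# A centered 2x2 object in a 4x4 image versus a single corner pixel.
centered = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
corner = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(mask_stats(centered))
print(mask_stats(corner))
```

Under these assumed definitions, a small object far from the center yields a low area fraction and a high center offset, which is the regime the abstract associates with greater reliance on spurious features.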

Author Statement: Yes

URL: mmoayeri.github.io/HardImagenet

TL;DR: A new perspective on classification performance: how can we learn to predict *for the right reasons* when our data is suboptimal (i.e., riddled with spurious cues)?

Supplementary Material: pdf

Dataset Url: mmoayeri.github.io/HardImageNet This page and the accompanying GitHub repo contain all code to download the data, evaluate models on the benchmark, and generate the plots shown in the paper.

License: CC-0: Creative Commons Public Domain Dedication

Contribution Process Agreement: Yes

In Person Attendance: Yes
