Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases

Aengus Lynch; Gbetondji Jean-Sebastien Dovonon; Jean Kaddour; Ricardo Silva

Published: 06 Mar 2025, Last Modified: 01 May 2025

Venue: SCSL @ ICLR 2025

License: CC BY 4.0

Track: regular paper (up to 6 pages)

Keywords: spurious correlation, benchmark

TL;DR: A Benchmark for Fine Control of Spurious Correlation Biases

Abstract: The problem of spurious correlations (SCs) arises when a classifier relies on non-predictive features that happen to be correlated with the labels in the training data. Previous SC benchmark datasets suffer from various issues: for example, they are over-saturated, or they contain only one-to-one (O2O) SCs and no many-to-many (M2M) SCs arising between groups of spurious attributes and classes. In this paper, we present Spawrious-{O2O, M2M}-{Easy, Medium, Hard}, an image classification benchmark suite containing spurious correlations between classes and backgrounds. We employ a text-to-image model to generate photo-realistic images and an image captioning model to filter out unsuitable ones. The resulting dataset is of high quality and contains approximately 152k images. Our experimental results demonstrate that state-of-the-art group robustness methods struggle with Spawrious.
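
The distinction between O2O and M2M spurious correlations can be illustrated with a toy sampler. The sketch below is one plausible reading of the abstract, not the authors' data-generation pipeline: the function names, the `strength` parameter, and the placeholder class and background labels are all hypothetical, introduced only for illustration.

```python
import random


def sample_o2o(classes, backgrounds, strength, n, seed=0):
    """Toy O2O setting (assumption): class i appears on background i
    with probability `strength`, otherwise on a uniformly random background.
    Assumes len(classes) == len(backgrounds)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        i = rng.randrange(len(classes))
        if rng.random() < strength:
            background = backgrounds[i]  # spuriously matched background
        else:
            background = rng.choice(backgrounds)  # background drawn at random
        pairs.append((classes[i], background))
    return pairs


def sample_m2m(class_groups, background_groups, n, seed=0):
    """Toy M2M setting (assumption): every class in group g co-occurs
    only with backgrounds drawn from background group g."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        g = rng.randrange(len(class_groups))
        pairs.append((rng.choice(class_groups[g]),
                      rng.choice(background_groups[g])))
    return pairs


if __name__ == "__main__":
    # Placeholder labels; the real benchmark's classes and backgrounds
    # are not specified in this abstract.
    print(sample_o2o(["c0", "c1"], ["b0", "b1"], strength=0.9, n=5))
    print(sample_m2m([["c0", "c1"], ["c2", "c3"]],
                     [["b0", "b1"], ["b2", "b3"]], n=5))
```

In this toy version, a classifier trained on O2O samples could predict the class from the background alone with accuracy close to `strength`. In the M2M case, the background identifies only the group of classes, not the class itself.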

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 41
