Finding and Fixing Spurious Patterns with Explanations

Gregory Plumb; Marco Tulio Ribeiro; Ameet Talwalkar

Published: 16 Aug 2022, Last Modified: 17 Sept 2024. Accepted by TMLR.

Authors that are also TMLR Expert Reviewers: ~Gregory_Plumb2

Abstract: Image classifiers often use spurious patterns, such as “relying on the presence of a person to detect a tennis racket,” which do not generalize. In this work, we present an end-to-end pipeline for identifying and mitigating spurious patterns for such models, under the assumption that we have access to pixel-wise object annotations. We start by identifying patterns such as “the model’s prediction for tennis racket changes 63% of the time if we hide the people.” Then, if a pattern is spurious, we mitigate it via a novel form of data augmentation. We demonstrate that our method identifies a diverse set of spurious patterns and that it mitigates them by producing a model that is both more accurate on a distribution where the spurious pattern is not helpful and more robust to distribution shift.
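
The identification statistic quoted in the abstract ("the model's prediction ... changes 63% of the time if we hide the people") can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked code for that): the function name, the 0.5 decision threshold, and the toy scores below are all assumptions made for illustration, and the paper may restrict the count to a particular subset of images.

```python
from typing import Sequence


def prediction_change_rate(
    original_probs: Sequence[float],
    masked_probs: Sequence[float],
    threshold: float = 0.5,  # assumed decision threshold, not taken from the paper
) -> float:
    """Fraction of images whose thresholded prediction flips after masking.

    original_probs[i] is the model's score for the target class on image i;
    masked_probs[i] is its score on the same image with the other object hidden.
    """
    if len(original_probs) != len(masked_probs):
        raise ValueError("inputs must have the same length")
    if not original_probs:
        raise ValueError("need at least one image")
    flips = sum(
        (p >= threshold) != (q >= threshold)
        for p, q in zip(original_probs, masked_probs)
    )
    return flips / len(original_probs)


# Hypothetical scores for "tennis racket" on four images,
# before and after hiding the people (made-up numbers).
original = [0.9, 0.8, 0.7, 0.2]
masked = [0.3, 0.9, 0.4, 0.1]
rate = prediction_change_rate(original, masked)
```

In this toy input, two of the four predictions cross the threshold once the people are hidden. A high rate on real data would flag the pattern as a candidate for review; deciding whether it is actually spurious remains a separate judgment, as the abstract notes.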

Certifications: Expert Certification

Submission Length: Regular submission (no more than 12 pages of main content)

Code: https://github.com/GDPlumb/SPIRE

Assigned Action Editor: ~Ekin_Dogus_Cubuk1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 146
