Learning Robust Models by Countering Spurious Correlations

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: robustness, domain adaptation, spurious correlation, dataset bias
Abstract: Machine learning has demonstrated remarkable prediction accuracy on i.i.d. data, but this accuracy often drops when models are tested on data from a different distribution. One reason for this drop is that models rely on features that are associated with the label in the training distribution but not in the test distribution. This problem is variously known as spurious correlation, confounding factors, or dataset bias. In this paper, we formally study the generalization error bound for this setting, given knowledge of how the spurious features are associated with the label. We compare our analysis to the widely accepted domain adaptation error bound and show that our bound can be tighter, under additional assumptions that we consider realistic. Further, our analysis naturally suggests a set of solutions to this problem, linked to established methods across various areas of robustness research; all of these solutions require some understanding of how the spurious features are associated with the label. Finally, we briefly discuss a method that does not require such an understanding.
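For context, the "widely accepted domain adaptation error bound" the abstract compares against is presumably the classic result of Ben-David et al. (2010); a sketch of its statement in standard notation, where $\epsilon_S$ and $\epsilon_T$ denote source (training) and target (test) risks:

```latex
% Classic domain adaptation bound (Ben-David et al., 2010), stated for
% reference; d_{H\Delta H} is the H\Delta H-divergence between the two
% distributions, and \lambda is the joint risk of the best hypothesis.
\[
  \epsilon_T(h) \;\le\; \epsilon_S(h)
  \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  \;+\; \lambda,
  \qquad
  \lambda = \min_{h' \in \mathcal{H}} \bigl[\, \epsilon_S(h') + \epsilon_T(h') \,\bigr].
\]
```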
One-sentence Summary: We derive a formal generalization error bound for learning in the presence of spuriously correlated features, given knowledge of those features, and use the bound to discuss methods for this problem.
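To make the setup concrete, here is a minimal, hypothetical sketch (not from the paper) of the failure mode the abstract describes: a synthetic feature that tracks the label at training time but is independent of it at test time, causing a model that relies on it to lose accuracy under distribution shift. The data-generating function and all parameter values are illustrative assumptions.

```python
# Illustrative sketch: a spurious feature agrees with the label with
# probability `spurious_corr` (0.95 in training, 0.50 in testing).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, spurious_corr):
    """Binary labels; a weak 'core' feature and a 'spurious' feature."""
    y = rng.integers(0, 2, size=n)
    core = y + rng.normal(0.0, 2.0, size=n)            # weakly predictive everywhere
    agree = rng.random(n) < spurious_corr
    spurious = np.where(agree, y, 1 - y) + rng.normal(0.0, 0.1, size=n)
    return np.column_stack([core, spurious]), y

X_train, y_train = make_data(5000, spurious_corr=0.95)  # spurious cue is informative
X_test,  y_test  = make_data(5000, spurious_corr=0.50)  # association removed at test

clf = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", clf.score(X_train, y_train))   # high: model leans on the cue
print("test accuracy: ", clf.score(X_test,  y_test))    # drops once the cue breaks
```

Running this shows training accuracy near the 95% agreement rate of the spurious feature, while test accuracy falls toward what the weak core feature alone can support.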
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=vPUE2QzXj