Empirical or Invariant Risk Minimization? A Sample Complexity PerspectiveDownload PDF

Published: 12 Jan 2021, Last Modified: 22 Oct 2023ICLR 2021 PosterReaders: Everyone
Keywords: invariant risk minimization, IRM
Abstract: Recently, invariant risk minimization (IRM) was proposed as a promising solution to address out-of-distribution (OOD) generalization. However, it is unclear when IRM should be preferred over the widely-employed empirical risk minimization (ERM) framework. In this work, we analyze both these frameworks from the perspective of sample complexity, thus taking a firm step towards answering this important question. We find that depending on the type of data generation mechanism, the two approaches might have very different finite sample and asymptotic behavior. For example, in the covariate shift setting we see that the two approaches not only arrive at the same asymptotic solution, but also have similar finite sample behavior with no clear winner. For other distribution shifts such as those involving confounders or anti-causal variables, however, the two approaches arrive at different asymptotic solutions where IRM is guaranteed to be close to the desired OOD solutions in the finite sample regime, while ERM is biased even asymptotically. We further investigate how different factors --- the number of environments, complexity of the model, and IRM penalty weight --- impact the sample complexity of IRM in relation to its distance from the OOD solutions.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: In this work, we provide a sample complexity comparison of the recent invariant risk minimization (IRM) framework with the classic empirical risk minimization (ERM) to answer when is IRM better than ERM in terms of out-of-distribution generalization?
Supplementary Material: zip
Code: [![github](/images/github_icon.svg) IBM/IRM-games](https://github.com/IBM/IRM-games) + [![Papers with Code](/images/pwc_icon.svg) 2 community implementations](https://paperswithcode.com/paper/?openreview=jrA5GAccy_)
Data: [Colored MNIST](https://paperswithcode.com/dataset/colored-mnist)
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 3 code implementations](https://www.catalyzex.com/paper/arxiv:2010.16412/code)
12 Replies