Robust Invariant Representation Learning by Distribution Extrapolation

TMLR Paper4907 Authors

22 May 2025 (modified: 23 May 2025) | Under review for TMLR | CC BY 4.0
Abstract: Invariant risk minimization (IRM) aims to enable out-of-distribution (OOD) generalization in deep learning by learning invariant representations. As IRM poses an inherently challenging bi-level optimization problem, most existing approaches, including IRMv1, adopt penalty-based single-level approximations. However, empirical studies consistently show that these methods often fail to outperform well-tuned empirical risk minimization (ERM), highlighting the need for more robust IRM implementations. This work theoretically identifies a key limitation common to many IRM variants: their penalty terms are highly sensitive to limited environment diversity and to over-parameterization, resulting in performance degradation. To address this issue, a novel extrapolation-based framework is proposed that enhances environment diversity by augmenting the IRM penalty through synthetic distributional shifts. Extensive experiments, ranging from synthetic setups to realistic, over-parameterized scenarios, demonstrate that the proposed method consistently outperforms state-of-the-art IRM variants, validating its effectiveness and robustness.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~changjian_shui1
Submission Number: 4907