Keywords: Domain Generalization, Out-of-distribution Generalization, Transfer Learning, Distribution Shift, Covariate Shift
TL;DR: We use the Markov property of causal chains to identify a causal, and consequently domain-general, representation that is invariant to distribution shift.
Abstract: Invariant Causal Prediction provides a framework for domain (or out-of-distribution) generalization, predicated on the assumption that causal mechanisms remain invariant across the data distributions of interest. Accordingly, the Invariant Risk Minimization (IRM) objective was proposed to learn this stable structure, given a sufficient number of distinct training distributions. However, recent work has identified limitations of IRM when it is applied to data-generating mechanisms that differ from those considered in its formulation. This work considers a chain generative process in which domain-specific exogenous factors influence all features, but the target is free of direct domain-specific influences. We propose a target conditioned representation independence (TCRI) constraint, which enforces the mediating role of the observed target with respect to the causal chain of latent features we aim to identify. We empirically show a setting where this approach outperforms both Empirical Risk Minimization (ERM) and IRM.
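
One way to read the TCRI constraint is as a conditional independence requirement: if the observed target mediates a chain of latent features, the Markov property implies that features upstream and downstream of the target are independent once the target is given. The sketch below illustrates how such a requirement could be turned into a penalty. It is not the method of this work; it assumes a discrete target, two already-extracted representations, and the Hilbert-Schmidt Independence Criterion (HSIC) with an RBF kernel as the dependence measure, none of which the abstract specifies. All function names (`hsic`, `target_conditioned_penalty`) are hypothetical.

```python
import numpy as np


def _rbf_gram(x, sigma=1.0):
    """RBF (Gaussian) Gram matrix for samples x of shape (n,) or (n, d)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def hsic(a, b, sigma=1.0):
    """Biased empirical HSIC estimate between paired samples a and b."""
    n = len(a)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    k = _rbf_gram(a, sigma)
    l = _rbf_gram(b, sigma)
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2


def target_conditioned_penalty(rep_a, rep_b, y, sigma=1.0):
    """Average within-class HSIC between two representations.

    Assumption (not from the abstract): y is discrete, so conditioning
    on the target is approximated by splitting the samples by label.
    """
    rep_a, rep_b, y = np.asarray(rep_a), np.asarray(rep_b), np.asarray(y)
    labels = np.unique(y)
    total = 0.0
    for label in labels:
        mask = y == label
        if mask.sum() < 2:
            continue  # HSIC is undefined for fewer than two samples
        total += hsic(rep_a[mask], rep_b[mask], sigma)
    return total / len(labels)
```

In a training loop, a term like this would be added to the prediction loss with a weighting coefficient, which requires a differentiable implementation in the framework of choice. The NumPy version here is only meant to make the quantity concrete: it should be near zero when the two representations are independent given the target, and larger when they share information the target does not explain.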