Keywords: causal discovery; heterogeneous data; multiple environments; nonlinear independent component analysis
TL;DR: We show that the causal graphs of structural causal models with arbitrary mechanisms are uniquely identifiable from the auxiliary information of only two environments. This bridges the gap with ICA identifiability results with multiple environments.
Abstract: Causal discovery from i.i.d. observational data is known to be generally ill-posed. We demonstrate that, given access to the distribution of a structural causal model and additional data from *only two* environments that differ sufficiently in their noise statistics, the causal graph is uniquely identifiable. Notably, this is the first result in the literature that guarantees recovery of the entire causal graph with a constant number of environments and arbitrary nonlinear mechanisms. Our only constraint is the Gaussianity of the noise terms; however, we propose potential ways to relax this requirement. Of independent interest, we expand on the well-known duality between independent component analysis (ICA) and causal discovery. Recent advances have shown that nonlinear ICA can be solved from multiple environments, provided there are at least as many environments as sources; we show that the same can be achieved for causal discovery with much less auxiliary information.
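To make the data setting in the abstract concrete, the following is a minimal illustrative sketch, not the authors' method or code. It samples a toy three-variable structural causal model with Gaussian noise in one base environment and two auxiliary environments that differ only in their noise standard deviations. The graph, the mechanisms, the additive form of the noise, the noise scales, and the names `sample_scm` and `environments` are all assumptions chosen for illustration; the scales are not claimed to satisfy the paper's "sufficiently differ" condition.

```python
import numpy as np


def sample_scm(n, noise_scales, seed=0):
    """Sample n points from a toy SCM with graph X1 -> X2 -> X3 and X1 -> X3.

    noise_scales holds the standard deviations of the three independent
    Gaussian noise terms. Mechanisms and additive noise are illustrative
    assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    s1, s2, s3 = noise_scales
    n1 = rng.normal(0.0, s1, n)
    n2 = rng.normal(0.0, s2, n)
    n3 = rng.normal(0.0, s3, n)
    x1 = n1                                  # root node
    x2 = np.tanh(2.0 * x1) + n2              # nonlinear in its parent X1
    x3 = np.sin(x2) + 0.5 * x1**2 + n3       # nonlinear in parents X1, X2
    return np.column_stack([x1, x2, x3])


# Same mechanisms in every environment; only the noise statistics change.
environments = {
    "base": sample_scm(5000, (1.0, 1.0, 1.0), seed=0),
    "aux_1": sample_scm(5000, (2.0, 0.5, 1.5), seed=1),
    "aux_2": sample_scm(5000, (0.5, 2.0, 0.7), seed=2),
}

for name, data in environments.items():
    print(name, data.shape, np.round(data.std(axis=0), 2))
```

The point of the sketch is only the structure of the input: one observational distribution plus two datasets generated by the same mechanisms under different Gaussian noise scales.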
Supplementary Material: zip
Primary Area: causal reasoning
Submission Number: 19934