Unified Causal Discovery and Missing Data Imputation

Published: 03 Feb 2026, Last Modified: 03 Feb 2026, AISTATS 2026 Poster, CC BY 4.0
TL;DR: We perform layer-wise ordered causal structure learning with missing data imputation
Abstract: Causal discovery and data imputation are often treated separately, yet both face challenges when data are missing. Existing causal discovery methods discard incomplete samples, leading to significant information loss, while standard imputation approaches rely on spurious correlations that distort the underlying causal signal. We introduce LOGIC, a framework that performs causal discovery and causally consistent imputation simultaneously. Whereas existing work simply assumes that all source variables in the causal graph are observed, we establish a verifiable criterion for this assumption under MCAR and MAR missingness, using the Algorithmic Markov Condition postulate. Building on this, LOGIC proceeds layer by layer, identifying sources, recovering downstream relations, and imputing missing values, while explicitly declaring unknowns when imputation is unsupported rather than forcibly completing the data. This preserves causal reasoning even in challenging missingness regimes. Experiments on synthetic and real-world datasets demonstrate that LOGIC outperforms state-of-the-art baselines in both structure discovery and imputation accuracy.
Submission Number: 958
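
The abstract's point that discarding incomplete samples loses significant information can be illustrated with a small simulation. This is only a sketch and not part of LOGIC: the function name `complete_case_fraction` and the chosen values (10 variables, 10% missingness per entry) are hypothetical, and it assumes the simplest MCAR setting in which every entry is missing independently with the same probability. Under that assumption the expected fraction of fully observed rows is (1 - p)^d.

```python
import random

def complete_case_fraction(n_rows, n_vars, p_missing, seed=0):
    """Simulate MCAR missingness and return the share of fully observed rows.

    Hypothetical helper for illustration: each entry is missing
    independently with probability p_missing.
    """
    rng = random.Random(seed)
    complete = 0
    for _ in range(n_rows):
        # A row survives listwise deletion only if every entry is observed.
        if all(rng.random() >= p_missing for _ in range(n_vars)):
            complete += 1
    return complete / n_rows

# Illustrative values (assumptions, not taken from the paper).
n_vars, p_missing = 10, 0.1
expected = (1 - p_missing) ** n_vars      # closed form under MCAR
observed = complete_case_fraction(20000, n_vars, p_missing)
print(f"expected {expected:.3f}, simulated {observed:.3f}")
```

With these illustrative values, roughly two thirds of the rows would be dropped by listwise deletion even though only one entry in ten is missing, which is the kind of loss that joint discovery and imputation aims to avoid.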