Causal Discovery Beyond Scaling: Mixed-Type DAG Learning with Native Missing-Data Inference

Published: 24 Apr 2026 · Last Modified: 24 Apr 2026 · CauScale 2026 · CC BY 4.0
Keywords: Causal discovery, Mixed-type data, Missing data, Predictive coding, DAG structure learning, Missingness-aware inference, MCAR and MAR robustness, Scaling limits in causal ML
TL;DR: When learning causal DAGs from mixed-type, partially observed data, increasing optimization scale alone cannot recover structure quality once complete-case deletion collapses the retained rows, while explicit masked causal modeling remains robust.
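As a back-of-the-envelope illustration of the row collapse (an assumption-driven sketch, not a result reported by the paper): under MCAR with an independent per-cell missingness probability p across d columns, the fraction of fully observed rows is (1 - p)^d, which decays geometrically in d.

```python
# Fraction of rows surviving complete-case deletion under MCAR.
# Each of d cells is observed independently with probability 1 - p,
# so a row is fully observed with probability (1 - p) ** d.
def retained_row_fraction(p: float, d: int) -> float:
    return (1.0 - p) ** d

for p in (0.05, 0.10, 0.20):
    print(f"p={p:.2f}: " + ", ".join(
        f"d={d}: {retained_row_fraction(p, d):.3f}" for d in (10, 20, 50)
    ))
# At p=0.10, d=20 only ~12% of rows survive; at d=50 it is ~0.5%,
# the near-zero data-viability regime described in the abstract below.
```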
Abstract: Scaling predictive optimization alone is not sufficient for causal discovery when data are mixed-type and partially observed. We study this setting with PredCoM, a predictive-coding DAG learner that combines mixed-node likelihoods, sparse acyclicity regularization, and native masked state inference in a single objective. Across ER/SF/WS (Erdős–Rényi, scale-free, and Watts–Strogatz) synthetic benchmarks (RAW and CATMIX; complete, MCAR, and MAR), PredCoM is consistently competitive with NOTEARS, PC, DirectLiNGAM, LiM, mCMIkNN, and sortnregress. Missingness-rate and MAR-strength sweeps show that complete-case preprocessing deteriorates as the retained-row fraction collapses, while masked training remains substantially stronger. A compute-budget ablation shows that increasing epochs does not rescue complete-case failures when data viability is near zero. The results identify a concrete boundary of scaling: in mixed-data causal discovery, optimization budget cannot substitute for explicit missingness-aware causal modeling.
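To make the single-objective formulation concrete, below is a minimal, hypothetical sketch of a masked-likelihood DAG objective in the NOTEARS style: a linear-Gaussian reconstruction loss scored only on observed cells, plus an L1 sparsity term and the trace-exponential acyclicity penalty h(W) = tr(e^{W∘W}) - d. PredCoM's actual mixed-node likelihoods, predictive-coding updates, and masked state inference are not specified on this page, so every name and choice here is an illustrative assumption.

```python
import numpy as np
from scipy.linalg import expm

def acyclicity_penalty(W: np.ndarray) -> float:
    # NOTEARS trace-exponential term: h(W) = tr(exp(W * W)) - d,
    # which is zero iff the weighted adjacency matrix W encodes a DAG.
    d = W.shape[0]
    return float(np.trace(expm(W * W)) - d)

def masked_objective(W: np.ndarray, X: np.ndarray, M: np.ndarray,
                     lam: float = 0.1, rho: float = 10.0) -> float:
    # X: (n, d) data with arbitrary fill values at missing cells.
    # M: (n, d) binary mask, 1 where a cell is observed.
    X_obs = X * M                 # zero-fill missing inputs; a crude
                                  # stand-in for inferred masked states
    R = (X_obs - X_obs @ W) * M   # residuals scored on observed cells only
    nll = 0.5 * float((R ** 2).sum()) / max(float(M.sum()), 1.0)
    return nll + lam * float(np.abs(W).sum()) + rho * acyclicity_penalty(W)
```

The point of the sketch is that no rows are dropped: the mask enters the loss directly, so the effective sample size degrades per cell rather than per row, in contrast to complete-case preprocessing.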
Submission Number: 15