From Identifiable Causal Representations to Controllable Counterfactual Generation: A Survey on Causal Generative Modeling

Published: 23 May 2024, Last Modified: 23 May 2024. Accepted by TMLR.
Abstract: Deep generative models have shown tremendous capability in data density estimation and data generation from finite samples. While these models have shown impressive performance by learning correlations among features in the data, some fundamental shortcomings are their lack of explainability, tendency to induce spurious correlations, and poor out-of-distribution extrapolation. To remedy such challenges, recent work has proposed a shift toward causal generative models. Causal models offer several beneficial properties to deep generative models, such as distribution shift robustness, fairness, and interpretability. Structural causal models (SCMs) describe data-generating processes and model complex causal relationships and mechanisms among variables in a system. Thus, SCMs can naturally be combined with deep generative models. We provide a technical survey on causal generative modeling categorized into causal representation learning and controllable counterfactual generation methods. We focus on fundamental theory, methodology, drawbacks, datasets, and metrics. Then, we cover applications of causal generative models in fairness, privacy, out-of-distribution generalization, precision medicine, and biological sciences. Lastly, we discuss open problems and fruitful research directions for future work in the field.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- Reformatted tables and figures for better readability
- Replaced PNG images with PDF vector images for better readability
- Added a few more references to counterfactual inference in Section 4.4 (Other Works) for completeness
Assigned Action Editor: ~Jinwoo_Shin1
Submission Number: 1704