Abstract: Understanding the causal graph underlying a system is essential for causal inference, particularly in fields such as medicine and genetics. Identifying a causal Directed Acyclic Graph (DAG) from observational data alone is challenging because multiple DAGs can encode the same set of conditional independencies; such an equivalence class of DAGs is represented by a Completed Partially Directed Acyclic Graph (CPDAG). Approximating the CPDAG well is crucial because it narrows down the set of possible causal graphs underlying the data. We introduce CPDAG-GFN, a novel approach that uses a Generative Flow Network (GFlowNet) to learn a posterior distribution over CPDAGs. Sampling from this distribution yields a set of plausible candidates that approximate the ground truth. The method favors high-reward CPDAGs, with rewards determined by a score function that quantifies how well each graph fits the data. Experimental results on both simulated and real-world datasets demonstrate that CPDAG-GFN performs competitively with state-of-the-art methods for learning CPDAG candidates from observational data.
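To illustrate why several DAGs can encode the same conditional independencies, the sketch below checks Markov equivalence with the standard graphical criterion: two DAGs over the same nodes are equivalent exactly when they share the same skeleton and the same v-structures. It is an illustrative example only and is not part of CPDAG-GFN; the function names and the three-node graphs are assumptions made for this example, and every node is assumed to appear in at least one edge.

```python
from itertools import combinations

# Illustrative sketch, not the CPDAG-GFN implementation.
# A DAG is given as a list of directed edges (parent, child).
# Assumption: both graphs are defined over the same node set.

def skeleton(edges):
    """Undirected version of the graph: one unordered pair per edge."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """Triples (a, child, b) where a -> child <- b and a, b are not adjacent."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    skel = skeleton(edges)
    found = set()
    for child, child_parents in parents.items():
        for a, b in combinations(sorted(child_parents), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges_a, edges_b):
    """Same skeleton and same v-structures implies the same CPDAG."""
    return (skeleton(edges_a) == skeleton(edges_b)
            and v_structures(edges_a) == v_structures(edges_b))

# Hypothetical three-node examples.
chain = [("X", "Y"), ("Y", "Z")]      # X -> Y -> Z
fork = [("Y", "X"), ("Y", "Z")]       # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]   # X -> Y <- Z

print(markov_equivalent(chain, fork))      # chain and fork share one CPDAG
print(markov_equivalent(chain, collider))  # the collider is in its own class
```

The chain and the fork cannot be told apart from observational data, so both map to the CPDAG X - Y - Z, whereas the collider forms a separate equivalence class.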
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Fabio_Stella1
Submission Number: 3298