Abstract: Inferring the causal relationships among a set of variables in the form of a directed acyclic graph (DAG) is an important but notoriously challenging problem. Recently, advancements in high-throughput genomic perturbation screens have inspired the development of methods that leverage interventional data to improve model identification. However, existing methods still suffer from poor performance on large-scale tasks and fail to quantify uncertainty. Here, we propose Interventional Bayesian Causal Discovery (IBCD), an empirical Bayesian framework that infers the causal graph by using intervention data to estimate the effect of each variable on every other, then inferring the posterior graph given these estimates. For tractability, our approach models the likelihood of the matrix of estimated total causal effects, which can be approximated by a matrix normal distribution, rather than the full data matrix. We place a spike-and-slab horseshoe prior on the edges and separately learn data-driven weights for scale-free and Erdős–Rényi structures from observational data, treating each edge as a latent variable to enable uncertainty-aware inference. Through extensive simulation, we show that IBCD achieves superior structure recovery compared to existing baselines. We apply IBCD to CRISPR perturbation (Perturb-seq) data on 521 genes, demonstrating that edge posterior inclusion probabilities enable identification of robust graph structures.
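Illustrative sketch (not from the submission): the abstract describes estimating the total causal effect of each variable on every other and then inferring the graph from those estimates. The snippet below shows why a total-effect matrix carries information about direct edges, under the assumption of a linear structural equation model, which the abstract does not state. The function names `total_effects` and `direct_effects`, the convention that `B[i, j]` is the direct effect of variable i on variable j, and the toy three-node chain are all hypothetical. It is not IBCD's estimator, likelihood, or prior.

```python
import numpy as np

def total_effects(B):
    """Total effects implied by a direct-effect matrix B (assumed linear SEM).

    With B[i, j] the direct effect of i on j in an acyclic graph,
    summing products of edge weights over all directed paths gives
    (I - B)^{-1} - I.
    """
    d = B.shape[0]
    return np.linalg.inv(np.eye(d) - B) - np.eye(d)

def direct_effects(T):
    """Invert the map above: recover direct effects from total effects."""
    d = T.shape[0]
    return np.eye(d) - np.linalg.inv(np.eye(d) + T)

# Hypothetical toy chain 0 -> 1 -> 2 with edge weights 0.5 and 2.0.
B = np.zeros((3, 3))
B[0, 1] = 0.5
B[1, 2] = 2.0

T = total_effects(B)      # T[0, 2] is the indirect effect 0.5 * 2.0
B_rec = direct_effects(T) # matches B when T is known exactly
```

With exact total effects the inversion recovers the direct edges. With noisy estimates it does not, which is the kind of setting where a likelihood on the estimated effect matrix and a sparsity-inducing prior on edges, as the abstract describes, would come into play.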
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Mingming_Gong1
Submission Number: 9092