RAPTORGraph: Graph-Based Pathway Modeling for Causal Discovery in Single-Cell Perturbations

Yeremia Gunawan Adhisantoso; Stephanie Kristin Schröder; Maximilian Greß; Mikel Hernaez; Jan Voges

18 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0.

Keywords: Causal Representation Learning, scRNA-seq, VAE, Generative Model, Perturb-seq, Interpretability

Abstract: Experiments involving the perturbation of individual cells are central to understanding cellular mechanisms and can accelerate drug discovery. Causal representation learning (CRL) allows us to uncover the latent factors that regulate biological systems and predict the impact of novel perturbations. Unfortunately, existing methods fail to address intervention spillover in a closed-world setting where intervention targets are known a priori, such as in Perturb-seq experiments, due to their reliance on dense encoders. Furthermore, incorporating curated biological pathways into the model imposes a confirmatory bias, forcing it to explain the data through preexisting pathways and reducing the set of hypotheses the model can explore, while discarding novel signals that lie outside the annotated pathways. In this work, we introduce RAPTORGraph, a $\beta$-VAE with a GraphPathway encoder that explicitly models complex gene-to-gene interactions within learned pathways. Moreover, our model's preconditioning isolates the influence of perturbed genes, yielding clean, single-node latent interventions required for identifiable causal discovery and eliminating spillover. Finally, we train the model on data preprocessed with optimal-transport alignment, which guarantees a well-defined mapping between control and perturbed samples and further stabilizes the learned latent representations. We demonstrate that RAPTORGraph improves state-of-the-art performance on downstream analyses of unseen perturbations, such as non-additive interactions, while outperforming other approaches on objective metrics, such as MSE and MK-MMD. The code will be made publicly available upon publication of this paper.

Supplementary Material: zip

Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)

Submission Number: 11570
