scDFM: Distributional Flow Matching Model for Robust Single-Cell Perturbation Prediction

ICLR 2026 Conference Submission 2304 Authors

05 Sept 2025 (modified: 30 Nov 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Machine Learning, Single Cell
Abstract: A central goal in systems biology and drug discovery is to predict the transcriptional response of cells to perturbations. This task is challenging due to the noisy, sparse nature of single-cell measurements and the fact that perturbations often induce population-level shifts rather than changes in individual cells. Existing deep learning methods typically assume cell-level correspondences, limiting their ability to capture such global effects. We present **scDFM**, a generative framework based on conditional flow matching that models the full distribution of perturbed cells conditioned on control states. By incorporating an MMD objective, our method aligns perturbed and control populations beyond cell-level correspondences. To further improve robustness to sparsity and noise, we propose the Perturbation-Aware Differential Transformer architecture (PAD-Transformer), a backbone that leverages gene interaction graphs and differential attention to capture context-specific expression changes. **scDFM** outperforms prior methods across multiple genetic and drug perturbation benchmarks, excelling in both unseen and combinatorial settings. In the combinatorial setting, it reduces MSE by 19.6\% over the strongest baseline. These results highlight the importance of distribution-level generative modeling for robust $\textit{in silico}$ perturbation prediction.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 2304
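
Illustrative note on the MMD objective mentioned in the abstract: the abstract states only that an MMD term is used to align populations beyond cell-level correspondences; it does not specify the kernel, bandwidth, or estimator. The sketch below is therefore not the authors' implementation. It assumes a Gaussian (RBF) kernel, a fixed bandwidth, and the biased (V-statistic) estimator, and uses hypothetical helper names (`rbf_kernel`, `mmd_squared`, `sample_cells`) on synthetic data, to show how a population-level discrepancy can be computed without pairing individual cells.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two expression vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two cell populations.

    No cell in xs is matched to any particular cell in ys: only averages of
    kernel values within and across the two populations are compared.
    """
    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_cells(n_cells, n_genes, shift, rng):
    """Synthetic stand-in for expression profiles: Gaussian noise around `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_genes)] for _ in range(n_cells)]


rng = random.Random(0)
control = sample_cells(50, 5, 0.0, rng)      # hypothetical control population
control_2 = sample_cells(50, 5, 0.0, rng)    # second draw from the same distribution
perturbed = sample_cells(50, 5, 1.5, rng)    # population with a shifted mean

same = mmd_squared(control, control_2, bandwidth=2.0)
shifted = mmd_squared(control, perturbed, bandwidth=2.0)
print(f"MMD^2 (same distribution):    {same:.4f}")
print(f"MMD^2 (shifted distribution): {shifted:.4f}")
```

In this toy setting the estimate is close to zero for two draws from the same distribution and clearly larger when the population mean shifts, which is the property a distribution-level training signal relies on. A differentiable version written in a deep learning framework would be needed to use such a term as a loss.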