Track: Main track
Keywords: Gene Expression, Genetic Perturbation Modelling, Distributional Shift, Histogram
TL;DR: In this work, we train a neural network that predicts per-gene histograms of gene expression following genetic perturbations.
Abstract: We introduce a simple, histogram-based approach for predicting distributional responses in gene expression following genetic perturbations. This is an essential task in early-stage drug discovery, where such responses can offer insights into gene function and inform target identification. Existing methods optimize only for changes in mean expression, overlooking the stochasticity inherent in single-cell data. We instead model per-gene expression distributions, predicting histograms conditioned on perturbations. This captures higher-order statistics (variance, skewness, kurtosis), on which our method outperforms baselines at a fraction of the training cost. To generalize to unseen perturbations, we incorporate prior knowledge via gene embeddings from large language models (LLMs). While modeling a richer output space, the method remains competitive at predicting mean expression changes. This work demonstrates that explicitly modeling distributional responses yields richer biological insights while remaining practical and efficient.
AI Policy Confirmation: I confirm that this submission clearly discloses the role of AI systems and human contributors and complies with the ICLR 2026 Policies on Large Language Model Usage and the ICLR Code of Ethics.
Submission Number: 50