PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis

Yan Wu; Esther Wershof; Sebastian M Schmon; Marcel Nassar; Błażej Osiński; Ridvan Eksi; Zichao Yan; Rory Stark; Kun Zhang; Thore Graepel

Published: 18 Sept 2025, Last Modified: 30 Oct 2025

Venue: NeurIPS 2025 Datasets and Benchmarks Track (poster)

License: CC BY 4.0

Keywords: Perturbation response modelling, single-cell RNA sequencing, variational autoencoder, variational inference, generative modelling, perturb-seq, benchmarking

TL;DR: We introduce PerturBench: a comprehensive model development and benchmarking framework for perturbation response modeling in single cells.

Abstract: We introduce a comprehensive framework for modeling single-cell transcriptomic responses to perturbations, aimed at standardizing benchmarking in this rapidly evolving field. Our approach includes a modular and user-friendly model development and evaluation platform, a collection of diverse perturbational datasets, and a set of metrics designed to fairly compare models and dissect their performance. Through extensive evaluation of both published and baseline models across diverse datasets, we highlight the limitations of widely used models, such as mode collapse. We also demonstrate the importance of rank metrics, which complement traditional measures of model fit such as RMSE when validating model effectiveness. Notably, our results show that while no single model architecture clearly outperforms the others, simpler architectures are generally competitive and scale well with larger datasets. Overall, this benchmarking exercise sets new standards for model evaluation, supports robust model development, and furthers the use of these models to simulate genetic and chemical screens for therapeutic discovery.
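The point about rank metrics complementing RMSE can be illustrated with a small sketch. This is not the PerturBench implementation: the metric definition (normalized rank of the matching observed profile among all observed profiles, ordered by RMSE to the prediction), the function names, and the three-perturbation toy data are all assumptions made for illustration. It compares a mode-collapsed predictor, which returns the same average profile for every perturbation, against a predictor that gets each direction right but overshoots the magnitude.

```python
import math


def rmse(a, b):
    """Root mean squared error between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mean_rmse(predicted, observed):
    """Average RMSE between each prediction and its matching observed profile."""
    return sum(rmse(predicted[p], observed[p]) for p in observed) / len(observed)


def mean_rank(predicted, observed):
    """Hypothetical rank metric (an assumption, not the paper's definition).

    For each perturbation, count how many *other* observed profiles lie
    strictly closer to its prediction than its own observed profile does,
    normalized to [0, 1]. 0 means the prediction always retrieves the right
    perturbation; about 0.5 is what a perturbation-agnostic predictor gets.
    """
    names = list(observed)
    total = 0.0
    for p in names:
        d_true = rmse(predicted[p], observed[p])
        closer = sum(
            1 for q in names
            if q != p and rmse(predicted[p], observed[q]) < d_true
        )
        total += closer / (len(names) - 1)
    return total / len(names)


# Toy observed effects for three perturbations over three genes (made-up values).
observed = {
    "A": [4.0, 0.0, 0.0],
    "B": [0.0, 2.0, 0.0],
    "C": [0.0, 0.0, 1.0],
}

# Mode-collapsed predictor: the same average profile for every perturbation.
average = [sum(v[g] for v in observed.values()) / len(observed) for g in range(3)]
collapsed = {p: average for p in observed}

# Perturbation-specific predictor: right direction, doubled magnitude.
overshoot = {p: [2.0 * x for x in v] for p, v in observed.items()}

for name, pred in [("collapsed", collapsed), ("overshoot", overshoot)]:
    print(name, "mean RMSE:", mean_rmse(pred, observed),
          "mean rank:", mean_rank(pred, observed))
```

On this toy data the collapsed predictor has the lower mean RMSE, yet its mean rank is 0.5 because its output carries no perturbation-specific information. The overshooting predictor has a worse RMSE but a mean rank of 0. RMSE alone would favor the collapsed model here, which is why a rank-style metric is a useful second view.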

Croissant File: json

Dataset URL: https://huggingface.co/datasets/altoslabs/perturbench

Code URL: https://github.com/altoslabs/perturbench/

Supplementary Material: zip

Primary Area: AI/ML Datasets & Benchmarks for life sciences (e.g. climate, health, life sciences, physics, social sciences)

Submission Number: 2548
