A Standardized Framework For Evaluating Gene Expression Generative Models

Published: 02 Mar 2026, Last Modified: 02 Mar 2026
Venue: Gen² 2026 Poster
License: CC BY 4.0
Track: Full / long paper (5-8 pages)
Keywords: Perturbation Modeling, Generative AI, Evaluation
TL;DR: We present GGE, a standardized, transparent, and biologically meaningful framework for evaluating single-cell generative models, making reported metrics comparable and reproducible across methods and papers.
Abstract: The rapid development of generative models for single-cell gene expression data has created an urgent need for standardized evaluation frameworks. Current evaluation practices suffer from inconsistent metric implementations, incomparable hyperparameter choices, and a lack of biologically grounded metrics. We present the Generated Genetic Expression Evaluator (GGE), an open-source Python framework that addresses these challenges. GGE provides a comprehensive suite of distributional metrics with explicit computation-space options, biologically motivated evaluation through analysis focused on differentially expressed genes (DEGs) with perturbation-effect correlation, and standardized reporting for reproducible benchmarking. Through an extensive analysis of the single-cell generative modeling literature, we find that no standardized evaluation protocol exists: methods report incomparable metrics computed in different spaces with different hyperparameters. We demonstrate that metric values vary substantially with implementation choices, underscoring the critical need for standardization. GGE enables fair comparison across generative approaches and accelerates progress in perturbation response prediction, cellular identity modeling, and counterfactual inference.
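To make the abstract's two evaluation ingredients concrete, the sketch below illustrates the general pattern it describes: a distributional metric computed in an explicitly chosen space, plus a perturbation-effect correlation on top differentially expressed genes. The abstract does not show GGE's API, so everything here (variable names, metric choices, the PCA space, the crude top-k stand-in for DEG calling, and the synthetic data) is an assumption for illustration, not GGE's actual interface.

```python
# Illustrative sketch only: GGE's real API is not given in the abstract.
# Shows (1) a distributional metric in an explicit computation space and
# (2) a perturbation-effect correlation on DEG-like top genes.
import numpy as np
from scipy.stats import pearsonr, wasserstein_distance
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic stand-ins for real, generated, and control expression matrices
# (cells x genes); real data would come from an AnnData object or similar.
real = rng.poisson(3.0, size=(500, 2000)).astype(float)
generated = rng.poisson(3.1, size=(500, 2000)).astype(float)
control = rng.poisson(2.8, size=(500, 2000)).astype(float)

# Explicit computation space: log1p counts projected by a PCA fit on the
# real data only. Making this choice explicit is the kind of implementation
# detail the abstract argues must be standardized.
pca = PCA(n_components=50).fit(np.log1p(real))
real_pc = pca.transform(np.log1p(real))
gen_pc = pca.transform(np.log1p(generated))

# Distributional metric: mean per-component 1-D Wasserstein distance in
# PCA space (one of many possible distributional metrics).
w1 = np.mean([wasserstein_distance(real_pc[:, k], gen_pc[:, k])
              for k in range(real_pc.shape[1])])

# Perturbation-effect correlation: per-gene mean log-fold change vs. control,
# compared between real and generated data on the top-100 real-effect genes
# (a crude stand-in for proper DEG calling).
lfc_real = np.log1p(real).mean(axis=0) - np.log1p(control).mean(axis=0)
lfc_gen = np.log1p(generated).mean(axis=0) - np.log1p(control).mean(axis=0)
top = np.argsort(-np.abs(lfc_real))[:100]
r, _ = pearsonr(lfc_real[top], lfc_gen[top])

print(f"Wasserstein distance (PCA space, 50 PCs): {w1:.4f}")
print(f"Perturbation-effect correlation (top-100 genes): {r:.3f}")
```

Because both numbers depend on choices such as the normalization, the number of principal components, and how many top genes are kept, reporting them without those settings makes results incomparable across papers, which is the gap the framework targets.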
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 27