Keywords: scaling laws, scRNA-seq, data quality
Abstract: Learning meaningful representations of cellular states is a key problem in computational biology. Yet, the scaling behavior of single-cell representation learning models remains poorly understood. While recent work has proposed that model performance scales predictably with measurement noise, this hypothesis has only been validated with relatively small models and datasets. In this work-in-progress, we present the first empirical evidence supporting measurement noise scaling laws at large scale, using datasets on the order of $10^7$ cells and transformer-based models with $>10^7$ parameters. We demonstrate that previously observed noise-scaling behavior consistently emerges in these large-scale models and datasets. Our results provide further evidence that measurement noise is an important scaling axis for cellular representation learning.
Submission Number: 53