Track: Tiny Paper Track
Keywords: Representation learning, SCVI, Drug screen, Perturbation, Cancer cell lines
TL;DR: We present an SCVI model pre-trained on a dataset with 60,000 drug perturbation experiments.
Abstract: We present a pre-trained SCVI model for the Tahoe-100M single-cell dataset, enabling large-scale single-cell analyses on systems with limited GPU memory. By compressing expression profiles from over 95 million cells into a 42 GB “minified” file plus a 1 GB model, this approach preserves essential biological signals while remaining practical for routine exploratory tasks. The openly available model supports downstream analyses such as differential expression and method development, without requiring access to the entire raw dataset.
Attendance: Valentine Svensson
Submission Number: 39