Keywords: single-cell annotation, out-of-distribution detection, hybrid data modelling, energy-based model
TL;DR: We introduce energy-based models for scRNA-seq annotation and OOD detection
Abstract: Single-cell sequencing has provided profound insights into heterogeneous cellular activities by measuring sequence information at single-cell resolution. Accurately annotating a single-cell RNA sequencing (scRNA-seq) dataset is a crucial step in the single-cell data analysis pipeline. In particular, previously unobserved cell types and cellular states frequently appear in scRNA-seq experiments and carry valuable information, which highlights the need for reliable annotation tools with out-of-distribution (OOD) detection capability. Recent advances in energy-based modelling have made it possible to design and deploy energy-based models (EBMs) for joint discriminative and generative tasks. In this work, we introduce EBMs for scRNA-seq annotation and investigate generative modelling for OOD detection, which results in more accurate, calibrated, and robust cell type predictions. Specifically, we develop CLAMS, an EBM instance based on the joint energy-based model (JEM), for hybrid modelling of single-cell data. Our experiments show that hybrid modelling with EBMs maintains the strong discriminative power of baseline classifiers and, by integrating generative capabilities, outperforms the state of the art in data annotation and OOD detection tasks. In addition, we provide a diagnosis of JEM training and propose effective regularization methods to boost JEM's performance. To the best of our knowledge, we are the first to apply EBMs to single-cell data modelling.
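To illustrate the JEM idea the abstract builds on, the following is a minimal sketch, not the CLAMS implementation. It assumes the standard JEM reading of a classifier, in which the logits f(x) define both the class posterior (softmax) and an energy E(x) = -logsumexp f(x), so that higher energy suggests an OOD cell. The logits, the threshold value, and the function names are hypothetical and chosen only for illustration; in practice the logits would come from a trained network and the threshold would be tuned on held-out data.

```python
import numpy as np


def logsumexp(logits, axis=-1):
    """Numerically stable log-sum-exp over the class axis."""
    m = np.max(logits, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(logits - m), axis=axis))


def class_probs(logits):
    """Discriminative view: p(y | x) is the softmax of the logits."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def energy(logits):
    """Generative view: E(x) = -logsumexp_y f(x)[y]; lower means more in-distribution."""
    return -logsumexp(logits)


def annotate(logits, threshold):
    """Predict a cell type per cell, or -1 when the energy exceeds the threshold."""
    e = energy(logits)
    labels = np.argmax(logits, axis=-1)
    return np.where(e > threshold, -1, labels), e


# Hypothetical logits for two cells over three cell types (assumed values).
logits = np.array([[8.0, 0.5, 0.2],   # confident, low energy
                   [0.1, 0.2, 0.0]])  # flat and small, high energy
labels, e = annotate(logits, threshold=-3.0)  # threshold is an assumed value
print(labels, e)
```

The same logits serve both purposes here: `class_probs` gives the annotation, while `energy` gives an unnormalized density score that can flag previously unobserved cell types.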