Keywords: benchmark, single-cell RNA sequencing, batch integration, self-supervised learning
TL;DR: We benchmark eight self-supervised learning methods for single-cell data analysis, identifying top-performing approaches and effective data augmentation techniques to guide best practices in the field.
Abstract: Self-supervised learning (SSL) has emerged as a powerful approach for learning biologically meaningful representations of single-cell data. To establish best practices in this domain, we present a comprehensive benchmark evaluating eight SSL methods across three downstream tasks and eight datasets, with various data augmentation strategies. Our results demonstrate that SimCLR and VICReg consistently outperform other methods across different tasks. Furthermore, we identify random masking as the most effective augmentation technique. This benchmark provides valuable insights into the application of SSL to single-cell data analysis, bridging the gap between SSL and single-cell biology.
Submission Number: 78