Benchmarking Self-Supervised Learning for Single-Cell Data

Philip Toma; Olga Ovcharenko; Imant Daunhawer; Julia E Vogt; Florian Barkmann; Valentina Boeva

Published: 13 Oct 2024, Last Modified: 02 Dec 2024

Venue: NeurIPS 2024 Workshop SSL

License: CC BY 4.0

Keywords: benchmark, single-cell RNA sequencing, batch integration, self-supervised learning

TL;DR: We benchmark several self-supervised learning methods for single-cell data analysis, identifying top-performing approaches and effective data augmentation techniques to guide best practices in the field.

Abstract: Self-supervised learning (SSL) has emerged as a powerful approach for learning biologically meaningful representations of single-cell data. To establish best practices in this domain, we present a comprehensive benchmark evaluating eight SSL methods across three downstream tasks and eight datasets, with various data augmentation strategies. Our results demonstrate that SimCLR and VICReg consistently outperform other methods across different tasks. Furthermore, we identify random masking as the most effective augmentation technique. This benchmark provides valuable insights into the application of SSL to single-cell data analysis, bridging the gap between SSL and single-cell biology.
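The random masking augmentation named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that masking means setting a randomly chosen fraction of a cell's gene expression values to zero, and the function name `random_mask`, the `mask_rate` value of 0.2, and the toy expression vector are all hypothetical choices made for illustration.

```python
import random


def random_mask(expression, mask_rate=0.2, rng=None):
    """Return a copy of one cell's expression vector with a random
    subset of genes set to zero (assumed form of random masking)."""
    rng = rng or random.Random()
    n_genes = len(expression)
    # Number of genes to mask; the rate is an illustrative assumption.
    n_masked = int(round(mask_rate * n_genes))
    masked_idx = set(rng.sample(range(n_genes), n_masked))
    return [0.0 if i in masked_idx else x for i, x in enumerate(expression)]


# Toy expression vector for a single cell (hypothetical values).
cell = [3.0, 0.0, 1.5, 7.0, 2.0, 0.5, 4.0, 1.0, 6.0, 2.5]

# Two independently masked views of the same cell, as a contrastive
# method such as SimCLR or VICReg would consume as a positive pair.
rng = random.Random(0)
view_a = random_mask(cell, mask_rate=0.2, rng=rng)
view_b = random_mask(cell, mask_rate=0.2, rng=rng)
```

In a setup like this, each cell would typically be augmented twice per training step, and the SSL objective would pull the embeddings of the two views together.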

Submission Number: 78
