Learning representations of cell populations for image-based profiling using contrastive learning

Robert Van Dijk; John Arevalo; Shantanu Singh; Anne E Carpenter

Published: 28 Nov 2022, Last Modified: 05 May 2023

Venue: LMRL 2022 Poster

Readers: Everyone

Keywords: image-based cell profiling, contrastive learning, representation learning, deep sets, aggregation

TL;DR: A novel learning-based method that automatically finds an effective way to aggregate single-cell data to improve the strength of image-based cell profiles.

Abstract: Image-based cell profiling is a powerful tool that compares differently perturbed cell populations by measuring thousands of single-cell features and summarizing them into vectors (or profiles). So-called average profiling, in which each single-cell feature is summarized with a measure of center, remains the most commonly used approach despite its simplicity. However, this method fails to capture the heterogeneity of cell populations, and capturing that heterogeneity has been shown to improve the phenotypic strength of profiles. A recent study proposed a method that does capture cell population heterogeneity, but it is difficult to use in practice. We therefore propose a Deep Sets-based method that learns the most effective way of aggregating single-cell feature data into a profile that predicts a compound’s mechanism of action better than average profiling does. This is achieved by applying weakly supervised contrastive learning in a multiple instance learning setting. Our proposed model provides a more accessible and better-performing method for aggregating single-cell feature data than previously published strategies and the average profiling baseline. The model likely achieves this by performing a form of quality control: filtering out noisy cells and prioritizing less noisy ones. The model cannot be directly transferred to unseen batch data; however, because the labels required for training are naturally available in cell profiling experiments, it can readily be trained on new data and then used immediately to infer the improved profiles. Applying this method could help improve the effectiveness of future cell profiling studies.
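
The contrast between average profiling and a Deep Sets style aggregator can be illustrated with a minimal sketch. It is not the authors' model: the layer sizes, the use of mean pooling, and all function names (`average_profile`, `deep_sets_profile`, `linear`, `relu`) are assumptions made for illustration, and the weights are random placeholders rather than weights learned with a contrastive objective. The sketch only shows the general form, a per-cell embedding followed by permutation-invariant pooling and a final mapping, and checks that the resulting profile does not depend on the order of the cells.

```python
import random


def average_profile(cells):
    """Baseline: per-feature mean over all cells in one population."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def deep_sets_profile(cells, phi_w, phi_b, rho_w, rho_b):
    """Deep Sets form: rho(pool_i(phi(cell_i))), here with mean pooling.

    phi embeds every cell independently, the pooling step collapses the
    set of embeddings into one vector, and rho maps it to the profile.
    """
    embedded = [relu(linear(cell, phi_w, phi_b)) for cell in cells]
    pooled = average_profile(embedded)  # order of cells does not matter
    return linear(pooled, rho_w, rho_b)


# Hypothetical sizes and random (untrained) placeholder weights.
rng = random.Random(0)
n_features, n_hidden, n_out = 4, 8, 3
phi_w = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_hidden)]
phi_b = [0.0] * n_hidden
rho_w = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
rho_b = [0.0] * n_out

# One synthetic "well": 50 cells with 4 features each.
cells = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(50)]
profile = deep_sets_profile(cells, phi_w, phi_b, rho_w, rho_b)

# Shuffling the cells should leave the profile unchanged (up to rounding).
shuffled = cells[:]
rng.shuffle(shuffled)
profile_shuffled = deep_sets_profile(shuffled, phi_w, phi_b, rho_w, rho_b)
max_diff = max(abs(a - b) for a, b in zip(profile, profile_shuffled))
```

In a trained model of this kind, the embedding and output weights would be fitted so that profiles sharing a label end up close together; the sketch leaves that training step out.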
