Self-supervised Visualisation of Microscopy Datasets

TMLR Paper3288 Authors

04 Sept 2024 (modified: 15 Nov 2024)Rejected by TMLREveryoneRevisionsBibTeXCC BY 4.0
Abstract: Self-supervised learning methods based on data augmentations, such as SimCLR, BYOL, or DINO, allow obtaining semantically meaningful representations of image datasets and are widely used prior to supervised fine-tuning. A recent self-supervised learning method, $t$-SimCNE, uses contrastive learning to directly train a 2D representation suitable for visualisation. When applied to natural image datasets, $t$-SimCNE yields 2D visualisations with semantically meaningful clusters. In this work, we used $t$-SimCNE to visualise medical image datasets, including examples from dermatology, histology, and blood microscopy. We found that increasing the set of data augmentations to include arbitrary rotations improved the results in terms of class separability, compared to data augmentations used for natural images. Our 2D representations show medically relevant structures and can be used to aid data exploration and annotation, improving on common approaches for data visualisation.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Emanuele_Sansone1
Submission Number: 3288
Loading