Learning Retinal Representations from Multi-modal Imaging via Contrastive Pre-training

Emese Sükei; Elisabeth Rumetshofer; Niklas Schmidinger; Ursula Schmidt-Erfurth; Günter Klambauer; Hrvoje Bogunović

Published: 28 Apr 2023, Last Modified: 30 May 2023, MIDL 2023 Short paper track, Poster, Readers: Everyone

Keywords: contrastive learning, predictive modelling, multi-modal imaging, retina

TL;DR: We propose a novel multi-modal contrastive learning approach for representation learning from retinal imaging.

Abstract: Contrastive representation learning techniques such as CLIP and CLOOB, trained on large multi-modal datasets, have demonstrated an impressive ability to produce highly transferable representations for different downstream tasks. In ophthalmology, large multi-modal datasets are readily accessible because retinal imaging scanners acquire both 2D fundus images and 3D optical coherence tomography (OCT) scans to assess disease. Motivated by this, we propose a model based on the CLIP/CLOOB objectives that learns joint representations of the two retinal imaging modalities. We evaluate the model's ability to retrieve the matching OCT scan given a fundus image of the same eye. Furthermore, we demonstrate the transferability of the learned representations through linear probing and fine-tuning on several prediction tasks from OCT.
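To make the objective and the retrieval evaluation concrete, the following is a minimal sketch of a CLIP-style symmetric InfoNCE loss and a top-k fundus-to-OCT retrieval score. It is an illustration under stated assumptions, not the authors' implementation: the embeddings are toy vectors standing in for encoder outputs, the temperature of 0.1 is arbitrary, all function names are hypothetical, and the CLOOB variant (InfoLOOB loss with Hopfield retrieval) is not shown.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(fundus_emb, oct_emb):
    """Cosine similarity between every fundus embedding and every OCT embedding."""
    f = [l2_normalize(v) for v in fundus_emb]
    o = [l2_normalize(v) for v in oct_emb]
    return [[sum(a * b for a, b in zip(fi, oj)) for oj in o] for fi in f]

def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def info_nce_loss(fundus_emb, oct_emb, temperature=0.1):
    """Symmetric InfoNCE: pair i (same eye) is the positive, all others negatives.
    The temperature value is an assumption chosen for illustration."""
    sim = similarity_matrix(fundus_emb, oct_emb)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # OCT-to-fundus direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))

def top_k_retrieval_accuracy(fundus_emb, oct_emb, k=1):
    """Fraction of fundus queries whose matching OCT ranks among the top k."""
    sim = similarity_matrix(fundus_emb, oct_emb)
    hits = 0
    for i, row in enumerate(sim):
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        hits += i in ranked[:k]
    return hits / len(sim)

# Toy embeddings (made up for illustration): row i of each list is the same eye.
fundus_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
oct_emb = [[0.9, 0.1], [0.1, 0.9], [1.0, 1.1]]

aligned_loss = info_nce_loss(fundus_emb, oct_emb)
shuffled_loss = info_nce_loss(fundus_emb, oct_emb[::-1])  # break the pairing
top1 = top_k_retrieval_accuracy(fundus_emb, oct_emb, k=1)
```

In this sketch, correctly paired embeddings should yield a lower loss than shuffled pairs, which is the signal a contrastive pre-training run would optimize over mini-batches of fundus and OCT encoder outputs.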
