Keywords: expert gaze data, self-supervised learning, supervised contrastive learning, glaucoma, optical coherence tomography
TL;DR: We train a transformer on the eye gaze of clinicians viewing medical images to generate 'pseudo-labels' for downstream weakly supervised contrastive learning, enabling high-accuracy eye disease diagnosis.
Abstract: In this work, we address the challenge of limited data availability common in healthcare
settings by using clinician (ophthalmologist) gaze data on optical coherence tomography
(OCT) report images as they diagnose glaucoma, a leading cause of blindness worldwide.
We use gaze data in two ways: first, we perform self-supervised pre-training via SimCLR
followed by supervised fine-tuning with gaze-overlaid OCT reports; second, we directly
learn gaze representations with our ‘GazeFormer’ model to generate pseudo-labels
using a multi-task objective. We use these pseudo-labels for weakly supervised contrastive learning
to detect glaucoma from a partially-labeled dataset of OCT report images. We find
that self-supervised pre-training with gaze-overlaid images significantly improves glaucoma
classification accuracy. Our natural-language-inspired region-based encoding baseline and
GazeFormer model pseudo-labels enable glaucoma detection accuracy exceeding 90% even
with only partially-labeled data.
Submission Type: Full Paper
Supplementary Material: zip
Submission Number: 22