Keywords: Retinal Fundus Images, Glaucoma Classification, Self-Supervised Learning
Abstract: Self-supervised learning (SSL) offers a powerful paradigm for medical image representation
learning, particularly in low-label regimes. However, standard pretext tasks often overlook
domain-specific cues vital for diseases like glaucoma, a leading cause of irreversible
blindness that manifests as subtle structural changes in the optic disc (OD) region.
Understanding the broader retinal context is essential, yet traditional models tend to overfit
to localized features, limiting generalizability. We propose a glaucoma-aware SSL framework
using a Deconvolutional Masked Autoencoder (Deconv-MAE) with a ViT-B encoder,
trained to reconstruct clean fundus images from inputs degraded by Gaussian noise and
anatomically aware OD masking. This lesion-focused corruption compels the model to learn
robust, context-rich representations. Pretrained on EYEPACS and fine-tuned on ORIGA-light,
our method outperforms both standard MAE and supervised baselines, highlighting
the value of anatomically informed pretext tasks in retinal diagnostics.
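The corruption step named in the abstract (Gaussian noise plus OD-targeted masking) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `corrupt_fundus`, the circular OD region given by a known center and radius, the 16-pixel patch grid, the noise level, and zero-filling of masked patches are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def corrupt_fundus(image, od_center, od_radius, patch=16, noise_std=0.1, rng=None):
    """Return (corrupted, patch_mask) for an H x W x C float image in [0, 1].

    Assumptions (not from the abstract): the OD is approximated by a circle
    with a known center (row, col) and radius in pixels, and a patch is
    masked when its center falls inside that circle.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    if h % patch or w % patch:
        raise ValueError("image height and width must be multiples of patch")

    # Step 1: additive Gaussian noise over the whole image, clipped to [0, 1].
    noisy = np.clip(image + rng.normal(0.0, noise_std, image.shape), 0.0, 1.0)

    # Step 2: mark patches whose centers lie inside the assumed OD circle.
    ys = (np.arange(h // patch) + 0.5) * patch
    xs = (np.arange(w // patch) + 0.5) * patch
    cy, cx = od_center
    dist = np.sqrt((ys[:, None] - cy) ** 2 + (xs[None, :] - cx) ** 2)
    patch_mask = dist <= od_radius

    # Step 3: expand the patch mask to pixel resolution and zero those pixels.
    pixel_mask = np.repeat(np.repeat(patch_mask, patch, axis=0), patch, axis=1)
    corrupted = noisy.copy()
    corrupted[pixel_mask] = 0.0
    return corrupted, patch_mask
```

In a pretraining loop of this kind, the corrupted array would be the model input and the original clean image the reconstruction target, so the network has to infer the hidden OD region from the surrounding retinal context.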
Submission Number: 120