Quantifying Regional Contributions to Sex Classification from Fundus Photographs via a Two-Stage Attention-Based Deep Learning Approach
Keywords: fundus photograph, sex classification, explainable AI, attention mechanism, retinal imaging
TL;DR: Beyond qualitative saliency: a multi-branch attention framework that quantifies region-wise retinal contributions to sex classification.
Registration Requirement: Yes
Abstract: Existing explainability methods for fundus-based sex classification rely on qualitative saliency visualizations, making it difficult to objectively quantify the proportional contribution of anatomical regions. We propose a two-stage multi-branch framework using pre-trained ResNet50 backbones for region-specific feature extraction from three predefined retinal ROIs (macula, optic disc, and vasculature), combined with an attention-based fusion module that produces ROI-level scalar weights. In a single-center dataset of 3,478 eye-level fundus images (1,973 subjects), the fusion model achieved the highest discriminative performance (AUC = 0.861; 95\% CI: 0.826--0.895), significantly outperforming all single-branch models ($p \leq 0.038$). The fusion mechanism assigned comparable weights to the macula (0.408; 95\% CI: 0.369--0.446) and optic disc (0.383; 95\% CI: 0.347--0.418), with no significant difference between the two ($p = 0.506$); both weights were significantly higher than that of the vasculature (0.209; 95\% CI: 0.184--0.238; $p < 0.001$). These quantitative, reproducible region-level weights offer a methodological step toward anatomically interpretable explainability in fundus-based classification.
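To make the idea of ROI-level scalar weights concrete, the following is a minimal sketch of one common way such an attention-based fusion could work. It is not the authors' implementation: the abstract does not specify the scoring function, so the dot-product scoring, the softmax normalization, the toy 8-dimensional features (standing in for ResNet50 embeddings), and the names `attention_fuse` and `score_vector` are all assumptions made for illustration.

```python
import math
import random

# ROI names follow the abstract; everything else here is hypothetical.
ROIS = ("macula", "optic_disc", "vasculature")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, score_vector):
    """Fuse per-ROI feature vectors with scalar attention weights.

    features: dict mapping ROI name -> list of floats (equal lengths).
    score_vector: scoring parameters (assumed learned in practice; random here).
    Returns (weights_by_roi, fused_feature).
    """
    # One scalar score per ROI: dot product of its feature with the score vector.
    scores = [
        sum(w * x for w, x in zip(score_vector, features[roi])) for roi in ROIS
    ]
    # Softmax turns scores into positive weights that sum to 1 (an assumption).
    weights = softmax(scores)
    dim = len(score_vector)
    # Fused feature is the weighted sum of the ROI features.
    fused = [
        sum(weights[i] * features[roi][j] for i, roi in enumerate(ROIS))
        for j in range(dim)
    ]
    return dict(zip(ROIS, weights)), fused


# Toy stand-ins for branch embeddings; a real model would use backbone outputs.
rng = random.Random(0)
dim = 8
features = {roi: [rng.gauss(0.0, 1.0) for _ in range(dim)] for roi in ROIS}
score_vector = [rng.gauss(0.0, 1.0) for _ in range(dim)]

weights, fused = attention_fuse(features, score_vector)
for roi, w in weights.items():
    print(f"{roi}: {w:.3f}")
```

Under this assumed normalization, the three weights for each image sum to 1, so averaging them over a test set yields region-level proportions that can be compared directly, in the spirit of the macula, optic disc, and vasculature weights reported above.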
Visa & Travel: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 118