Expert-Guided Cross-View Fusion with Self-Derived Lesion Proposals for Multi-View Diabetic Retinopathy Grading
Keywords: multi-view learning, diabetic retinopathy (DR) grading, medical image processing
Abstract: Recent advances in multi-view fundus imaging show great promise for automated diabetic retinopathy (DR) grading. However, mainstream end-to-end CNN/Transformer pipelines rely on striding or tokenization that compresses spatial detail, causing small, low-contrast lesions (e.g., microaneurysms) to be under-represented and imposing a performance ceiling. Prior efforts mitigate this by incorporating external lesion- or vessel-level annotations into models, but such labels are costly to acquire, break end-to-end training, and make performance over-reliant on annotation quality. To reduce dependence on expensive annotations, we propose an end-to-end framework that generates lesion proposals on the fly during both training and inference, providing self-derived cues for grading. First, we introduce a Grade-Activated Lesion Proposal (GALP) module that derives grade-conditioned evidence maps (GEMs) from stage-wise auxiliary classifiers and selects the top-K highest-evidence regions per view as lesion proposals. Second, we propose a Cross-View Lesion Expert Guided Regional Fusion (LGRF) module, which selectively activates experts for a view’s lesion proposals based on contextual guidance from the other views, so that only the most relevant feature extractors contribute to fusion. Experiments on two multi-view DR datasets show that our method matches or surpasses strong baselines without external annotations, confirming that self-generated proposals can substantially reduce annotation needs.
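The top-K proposal-selection step in GALP can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's implementation: the patch size, the mean-evidence scoring, and the function name `topk_lesion_proposals` are all assumptions; the actual GALP module conditions its evidence maps on grade predictions from stage-wise auxiliary classifiers, which is omitted here.

```python
import numpy as np

def topk_lesion_proposals(gem, k=4, patch=8):
    """Select the k highest-evidence patch regions from a grade-conditioned
    evidence map (GEM), returned as (row, col) top-left corners.
    Hypothetical sketch: patch size, tiling, and mean-pooled scoring are
    assumptions, not details from the paper."""
    h, w = gem.shape
    scored = []
    # Tile the evidence map into non-overlapping patches and score each
    # by its mean evidence.
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            score = gem[y:y + patch, x:x + patch].mean()
            scored.append((score, (y, x)))
    # Keep the k regions with the highest evidence as lesion proposals.
    scored.sort(key=lambda s: s[0], reverse=True)
    return [pos for _, pos in scored[:k]]
```

In the full framework these proposals would be computed per view and per stage, then passed to the fusion module; here a single map stands in for that pipeline.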
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 3129