M$^3$Ret: A Mixed Multimodal Image Dataset and Benchmark for Personalized Multi-Retinal Disease Detection

ICLR 2026 Conference Submission18816 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: multimodal learning, personalized multi-retinal disease detection, multimodal ophthalmic imaging dataset
TL;DR: This paper proposes a multimodal ophthalmic imaging dataset with multiple modality combinations and a benchmark for personalized multi-retinal disease detection.
Abstract: In ophthalmic clinical practice, imaging examinations such as retinal fundus photography and OCT imaging give ophthalmologists non-invasive ways to assess the condition of the retina, which highlights the importance of multimodal data. These examinations are tailored to each patient’s clinical condition, resulting in diverse modality combinations. However, existing multimodal ophthalmic imaging datasets collect only one combination of modalities, intended for single-disease detection. Correspondingly, previous multimodal models were designed to learn from a fixed combination of modalities, overlooking the personalized nature of clinical examinations and the variability in modality combinations. As a result, these models often fail to generalize well to real-world clinical applications. To bridge this gap, this paper presents (1) $\mathbf{\mathsf{M^3Ret}}$, a $\textbf{M}$ixed $\textbf{M}$ultimodal ophthalmic imaging dataset for personalized $\textbf{M}$ulti-$\textbf{Ret}$inal disease detection, which consists of scanning laser ophthalmoscopy (SLO) images and optical coherence tomography (OCT) images and includes various modality combinations; (2) $\mathbf{\mathsf{PersonNet}}$, a new baseline model for personalized multimodal multi-retinal disease detection, which can handle samples with various modality combinations during both the training and inference phases; and (3) benchmark results for $\mathsf{PersonNet}$ and 13 existing multimodal learning methods, which demonstrate the superiority of the proposed $\mathsf{PersonNet}$ and highlight the significant room for improvement that remains before clinical application can be achieved.
Primary Area: datasets and benchmarks
Submission Number: 18816