OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye SemanticsDownload PDF

Published: 17 Sept 2022, Last Modified: 12 Mar 2024NeurIPS 2022 Datasets and Benchmarks Readers: Everyone
Keywords: Ophthamology datasets, Biomarker analysis, Treatment prediction, Self-supervised learning
TL;DR: We propose a first-of-its-kind dataset that combines clinical labels, biomarkers, fundus, OCT scans, for disease prediction, treatment analysis and biomarker detection.
Abstract: Clinical diagnosis of the eye is performed over multifarious data modalities including scalar clinical labels, vectorized biomarkers, two-dimensional fundus images, and three-dimensional Optical Coherence Tomography (OCT) scans. Clinical practitioners use all available data modalities for diagnosing and treating eye diseases like Diabetic Retinopathy (DR) or Diabetic Macular Edema (DME). Enabling usage of machine learning algorithms within the ophthalmic medical domain requires research into the relationships and interactions between all relevant data over a treatment period. Existing datasets are limited in that they neither provide data nor consider the explicit relationship modeling between the data modalities. In this paper, we introduce the Ophthalmic Labels for Investigating Visual Eye Semantics (OLIVES) dataset that addresses the above limitation. This is the first OCT and near-IR fundus dataset that includes clinical labels, biomarker labels, disease labels, and time-series patient treatment information from associated clinical trials. The dataset consists of 1268 near-IR fundus images each with at least 49 OCT scans, and 16 biomarkers, along with 4 clinical labels and a disease diagnosis of DR or DME. In total, there are 96 eyes' data averaged over a period of at least two years with each eye treated for an average of 66 weeks and 7 injections. We benchmark the utility of OLIVES dataset for ophthalmic data as well as provide benchmarks and concrete research directions for core and emerging machine learning paradigms within medical image analysis.
Author Statement: Yes
URL: Dataset: https://doi.org/10.5281/zenodo.7105232, Benchmark: https://github.com/olivesgatech/OLIVES_Dataset
Dataset Url: Dataset: https://doi.org/10.5281/zenodo.7105232
License: The dataset is released under Creative Commons Attribution 4.0 International. The code is released under MIT License.
Supplementary Material: pdf
Contribution Process Agreement: Yes
In Person Attendance: Yes
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 4 code implementations](https://www.catalyzex.com/paper/arxiv:2209.11195/code)
17 Replies