Making Deep Learning Models Clinically Useful - Improving Diagnostic Confidence in Inherited Retinal Disease with Conformal Prediction

03 Aug 2024 (modified: 01 Sept 2024) | MICCAI 2024 Workshop UNSURE Submission | CC BY 4.0
Keywords: Deep learning, uncertainty, conformal prediction, IRD
Abstract: Deep Learning (DL), which involves powerful “black box” predictors, has achieved state-of-the-art performance in medical image analysis. However, these methods lack transparency and interpretability: they output point predictions without any assessment of the quality of those outputs. Knowing how much confidence to place in a prediction is essential for gaining clinicians’ trust in the technology and for its use in medical decision-making. In this paper, we explore the use of conformal prediction (CP) methods to recommend statistically rigorous, reliable prediction sets to a clinician, using multi-modal imaging for the genetic diagnosis of the 36 most common molecular causes of inherited retinal diseases (IRDs). IRDs are monogenic conditions that represent a leading cause of blindness in children and working-age adults and require a costly and time-consuming genetic test for diagnosis. Our IRD classifier (Eye2Gene) was trained on 44,817 retinal scans from the XXX YYY Hospital dataset, and the conformal predictor was calibrated on a further 13,012 scans. Three CP methods were assessed: Least Ambiguous Adaptive Prediction Sets (LAPS), Adaptive Prediction Sets (APS), and Regularized Adaptive Prediction Sets (RAPS). Eye2Gene, in combination with each of the three conformal predictors, was evaluated on an internal holdout subset and on datasets from four external clinical centres, for a total test set of 3,033 retinal scans. RAPS proved to be the best-performing method, with single-digit set sizes and coverage above 90% at a confidence level of 80%. Implementing adaptive CP methods has the potential to reduce the waiting time and cost of genetic diagnosis of IRDs by improving upon current gene prioritisation systems, while also supporting safety-critical clinical environments by flagging cases to clinicians for a second opinion.
Submission Number: 5
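To make the calibrate-then-predict workflow described in the abstract concrete, here is a minimal sketch of split conformal prediction using a non-randomized APS-style score. It is not the authors' implementation and does not use Eye2Gene: the function names (`aps_scores`, `calibrate`, `predict_sets`) are hypothetical, the softmax outputs are synthetic, and the only values taken from the abstract are the 36 classes and the 80% confidence level (`alpha = 0.2`).

```python
import numpy as np

def aps_scores(probs, labels):
    """APS-style score: probability mass of all classes ranked at or above the true class."""
    order = np.argsort(-probs, axis=1)                    # classes by descending probability
    cumsum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    ranks = np.argmax(order == labels[:, None], axis=1)   # rank of the true class in each row
    return cumsum[np.arange(len(labels)), ranks]

def calibrate(probs, labels, alpha):
    """Return the conformal threshold from a held-out calibration set."""
    n = len(labels)
    scores = np.sort(aps_scores(probs, labels))
    k = int(np.ceil((n + 1) * (1 - alpha)))               # finite-sample corrected rank
    return scores[k - 1] if k <= n else np.inf

def predict_sets(probs, qhat):
    """Boolean mask (n_samples, n_classes): True where a class is in the prediction set."""
    order = np.argsort(-probs, axis=1)
    cumsum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    keep_sorted = cumsum <= qhat                          # classes whose score is within threshold
    keep_sorted[:, 0] = True                              # always keep the top class (no empty sets)
    mask = np.zeros_like(keep_sorted)
    np.put_along_axis(mask, order, keep_sorted, axis=1)   # map back to original class order
    return mask

# Synthetic demonstration (assumption: a well-calibrated 36-class classifier).
rng = np.random.default_rng(0)
n_classes, n_cal, n_test, alpha = 36, 2000, 1000, 0.2

def synthetic_batch(n):
    probs = rng.dirichlet(np.full(n_classes, 0.3), size=n)
    u = rng.random(n)
    labels = (np.cumsum(probs, axis=1) < u[:, None]).sum(axis=1)  # sample label from probs
    return probs, np.minimum(labels, n_classes - 1)

cal_probs, cal_labels = synthetic_batch(n_cal)
test_probs, test_labels = synthetic_batch(n_test)

qhat = calibrate(cal_probs, cal_labels, alpha)
mask = predict_sets(test_probs, qhat)
coverage = mask[np.arange(n_test), test_labels].mean()   # fraction of sets containing the truth
sizes = mask.sum(axis=1)                                  # number of classes per prediction set
print(f"coverage={coverage:.3f}, mean set size={sizes.mean():.2f}")
```

Coverage is the fraction of test samples whose true class falls inside the prediction set, and set size is the number of candidate classes returned per sample. A regularized variant such as RAPS would add a penalty to the score for classes beyond a chosen rank, which this sketch omits.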