Class-Dependent Miscalibration Severely Degrades Selective Prediction in Multimodal Clinical Prediction Models

Published: 27 Nov 2025, Last Modified: 28 Nov 2025
Venue: ML4H 2025 Poster
License: CC BY 4.0
Keywords: Calibration; Multimodal learning; Selective prediction
TL;DR: We study calibration in multimodal, multilabel clinical classification using EHR data and chest X-rays. Selective prediction reveals that miscalibration either hides unsafe errors or adds review workload. Aggregate metrics such as AUC mask this effect, so per-label selective evaluation is essential.
Track: Findings
Abstract: As artificial intelligence systems transition from research to clinical deployment, ensuring their reliability becomes critical for clinical decision-making tasks, where incorrect predictions can have serious consequences. Deploying AI in healthcare therefore requires prediction systems with robust safeguards against error, such as selective prediction, where uncertain predictions are deferred to human experts for review. In this study, we carefully evaluate the reliability of uncertainty-based selective prediction for multilabel clinical condition classification using multimodal data. Our findings show that models often exhibit severe class-dependent miscalibration: they attribute high uncertainty to correct predictions and low uncertainty to incorrect predictions, which causes predictive performance to degrade under uncertainty-guided selective prediction. Our evaluation highlights fundamental shortcomings of commonly used evaluation metrics for clinical AI. To address these shortcomings, we propose practical recommendations for calibration-aware model assessment and selective prediction design, offering a pathway to safer, more reliable AI systems that clinicians and patients can trust.
General Area: Applications and Practice
Specific Subject Areas: Uncertainty & Distribution Shift, Explainability & Interpretability, Evaluation Methods & Validity
Data And Code Availability: No
Ethics Board Approval: No
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Code URL: https://github.com/jlaitue/medcertain
Submission Number: 41
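
Illustrative sketch (not taken from the paper or its repository): the abstract describes uncertainty-guided selective prediction, where the most uncertain predictions are deferred to a human, and argues for evaluating it per label. The Python below is a minimal sketch of that idea under stated assumptions. Binary entropy of the predicted probability is assumed as the uncertainty score, a 0.5 decision threshold is assumed, and the function names, label names, and toy data are all hypothetical. The authors' actual uncertainty estimates and metrics may differ.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) prediction; higher means more uncertain."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def selective_accuracy(probs, labels, coverage, threshold=0.5):
    """Accuracy on the `coverage` fraction of samples with the lowest uncertainty.

    The remaining (most uncertain) samples are treated as deferred to a human
    reviewer and are excluded from the score.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    n_keep = max(1, round(coverage * len(probs)))
    # Rank samples from most confident (low entropy) to least confident.
    order = sorted(range(len(probs)), key=lambda i: binary_entropy(probs[i]))
    kept = order[:n_keep]
    correct = sum((probs[i] >= threshold) == bool(labels[i]) for i in kept)
    return correct / n_keep


def per_label_selective_accuracy(probs_by_label, labels_by_label, coverages):
    """Evaluate each label separately instead of averaging over labels."""
    return {
        name: [
            selective_accuracy(probs_by_label[name], labels_by_label[name], c)
            for c in coverages
        ]
        for name in probs_by_label
    }


# Hypothetical toy data: both labels receive identical predicted probabilities,
# but for "label_b" the confident predictions are the wrong ones.
toy_probs = {
    "label_a": [0.95, 0.05, 0.90, 0.10, 0.55, 0.45],
    "label_b": [0.95, 0.05, 0.90, 0.10, 0.55, 0.45],
}
toy_labels = {
    "label_a": [1, 0, 1, 0, 0, 1],
    "label_b": [0, 1, 0, 1, 1, 0],
}

if __name__ == "__main__":
    results = per_label_selective_accuracy(toy_probs, toy_labels, [1.0, 0.5])
    for name, accs in results.items():
        print(name, [f"{a:.2f}" for a in accs])
```

In this toy setup, deferring the most uncertain half of the samples raises accuracy for `label_a` but lowers it for `label_b`, because for `label_b` the confident predictions are the incorrect ones. Averaging the two labels would hide that difference, which is the kind of class-dependent behavior the abstract says aggregate metrics can mask.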