A Framework for Assessing Joint Human-AI Systems Based on Uncertainty Estimation

Emir Konuk, Robert Welch, Filip Christiansen, Elisabeth Epstein, Kevin Smith

Published: 2024, Last Modified: 23 Feb 2026, Venue: MICCAI (10) 2024, License: CC BY-SA 4.0

Abstract: We investigate the role of uncertainty quantification in aiding medical decision-making. Existing evaluation metrics fail to capture the practical utility of joint human-AI decision-making systems. To address this, we introduce a novel framework to assess such systems and use it to benchmark a diverse set of confidence and uncertainty estimation methods. Our results show that certainty measures enable joint human-AI systems to outperform both standalone humans and AIs, and that for a given system there exists an optimal balance in the number of cases to refer to humans, beyond which the system’s performance degrades.
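
To make the referral trade-off described in the abstract concrete, the following is a minimal sketch, not the authors' framework. It assumes a simple deferral rule in which the k cases with the highest AI uncertainty are handed to a human and the rest are decided by the AI. The function names (`joint_accuracy`, `referral_curve`) and the toy data are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def joint_accuracy(ai_correct, human_correct, uncertainty, n_refer):
    """Accuracy of a joint system that refers the n_refer most uncertain
    cases to the human and lets the AI decide the remaining ones.
    (Assumed deferral rule, for illustration only.)"""
    n = len(uncertainty)
    # Rank cases from most to least uncertain.
    order = sorted(range(n), key=lambda i: uncertainty[i], reverse=True)
    referred = set(order[:n_refer])
    # Count a hit from the human on referred cases, from the AI otherwise.
    hits = sum(human_correct[i] if i in referred else ai_correct[i]
               for i in range(n))
    return hits / n


def referral_curve(ai_correct, human_correct, uncertainty):
    """Joint accuracy for every referral budget k = 0 .. n."""
    n = len(uncertainty)
    return [joint_accuracy(ai_correct, human_correct, uncertainty, k)
            for k in range(n + 1)]


# Hypothetical toy data: 1 = correct decision, 0 = incorrect decision.
uncertainty   = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1]  # AI uncertainty per case
ai_correct    = [0, 0, 1, 1, 1, 1]              # AI alone: 4/6 correct
human_correct = [1, 1, 1, 0, 1, 0]              # human alone: 4/6 correct

curve = referral_curve(ai_correct, human_correct, uncertainty)
# Smallest referral budget that attains the best joint accuracy.
best_k = max(range(len(curve)), key=curve.__getitem__)

print(curve)   # k = 0 is AI only, k = 6 is human only
print(best_k)  # 2
```

In this toy setup, referring the two most uncertain cases yields a joint accuracy of 1.0, above the 4/6 achieved by either the AI alone (k = 0) or the human alone (k = 6), and referring more than three cases lowers accuracy again. This mirrors, under the stated assumptions, the abstract's claim that there is an optimal number of cases to refer beyond which performance degrades.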