Where Does Prediction Error Come From When the Data Is Perfect? A Decomposition of the Model–World Gap in Predictive Uncertainty

Published: 04 Jun 2026, Last Modified: 08 Jun 2026 | PhilML@ICML 2026 Poster | CC BY 4.0
Keywords: uncertainty, decomposition, aleatoric
TL;DR: We develop a layered decomposition of the model–world gap in predictive uncertainty, distinguishing five structurally distinct error sources that persist even under clean random sampling from the target distribution.
Abstract: Most discussions of predictive uncertainty in machine learning treat data problems, e.g., finite samples, measurement error, or distribution shift, as the dominant sources of error. We argue that predictive uncertainty remains structured even when the analyst has access to large random samples from the target distribution. We refer to this as the model–world gap and, drawing together threads from the statistical, sociological, and ML uncertainty-quantification literatures, develop a layered decomposition that distinguishes five distinct error sources: aleatoric variability, concept-induced inflation, hypothesis class misspecification, asymptotic estimator bias, and finite-sample error. The framework provides a conceptual foundation for thinking about predictive uncertainty independently of data quality, and is a first step toward more principled diagnosis and mitigation of prediction error in high-stakes applications.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 58