Keywords: Uncertainty Estimation, Missing Values
TL;DR: We estimate the uncertainty caused by missing values and use that estimate, alongside operator knowledge, to determine when additional data is required to make a confident prediction.
Abstract: In high-stakes domains like healthcare, operators often face the critical decision of whether to act on incomplete information or incur costs to collect missing values. Existing methods typically focus on imputing missing data or quantifying model uncertainty, but they do not directly assess whether a prediction would remain stable if the missing features were revealed. To address this gap, we introduce a framework for Missing Value Uncertainty (MVU), the distribution of predictions induced by incomplete inputs. We formalize the problem by defining hard confidence: the probability that a prediction will not change after the missing data is collected. First, we propose a novel Direct Missing Value (DMV) method that efficiently estimates the MVU distribution, bypassing the need for expensive Monte Carlo sampling. Second, we introduce the Missing Value Calibration Error (MVCE), a new metric specifically designed to evaluate the calibration of hard confidence values, together with a post-hoc calibration procedure to improve MVU estimation. We demonstrate our method and metric on synthetic and real-world datasets.
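To make the definition of hard confidence concrete, the following is a minimal sketch of the Monte Carlo baseline that the abstract says DMV is meant to avoid. It is not the paper's method. The function name `mc_hard_confidence`, the `predict` callable, and the choice to fill missing features from randomly drawn rows of a reference dataset `train_X` are all assumptions made for illustration, as is the use of the most frequent label across completions as the reference prediction.

```python
import numpy as np

def mc_hard_confidence(predict, x, train_X, n_samples=1000, seed=0):
    """Monte Carlo estimate of hard confidence for one input with NaNs.

    Hypothetical helper: `predict` maps an (n, d) array to n class labels,
    and missing entries of `x` are marked with NaN.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    train_X = np.asarray(train_X, dtype=float)
    missing = np.isnan(x)

    # Start from n_samples copies of the incomplete input.
    completions = np.tile(x, (n_samples, 1))

    # Assumption: fill missing features from randomly chosen donor rows.
    # This ignores dependence between observed and missing features.
    rows = rng.integers(0, train_X.shape[0], size=n_samples)
    completions[:, missing] = train_X[rows][:, missing]

    # Distribution of predictions induced by the incomplete input.
    labels = predict(completions)
    values, counts = np.unique(labels, return_counts=True)

    # Assumption: the reference prediction is the most frequent label;
    # hard confidence is the fraction of completions that agree with it.
    top = np.argmax(counts)
    return values[top], counts[top] / n_samples
```

Under these assumptions, a returned confidence close to 1 suggests that collecting the missing features is unlikely to change the prediction, while a value near the chance level suggests that the additional data may be worth its cost.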
Supplementary Material: zip
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 20139