Is it worth it to collect missing values?: The Missing Value Uncertainty Problem

ICLR 2026 Conference Submission20139 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Uncertainty Estimation, Missing Values

TL;DR: We estimate the uncertainty caused by missing values and use that estimate, alongside operator knowledge, to determine when additional data is required to make a confident prediction.

Abstract: In high-stakes domains like healthcare, operators often face the critical decision of whether to act on incomplete information or incur costs to collect missing values. Existing methods typically focus on imputing missing data or quantifying model uncertainty, but they do not directly assess the stability of a prediction if missing features were to be revealed. To address this gap, we introduce a framework for Missing Value Uncertainty (MVU), which is the distribution of predictions induced by incomplete inputs. We formalize the problem by defining hard confidence: the probability that a prediction will not change after collecting the missing data. First, we propose a novel Direct Missing Value (DMV) method to efficiently estimate the MVU distribution, bypassing the need for expensive Monte Carlo sampling. Second, we introduce the Missing Value Calibration Error (MVCE), a new metric specifically designed to evaluate the calibration of hard confidence values, and a post-hoc calibration procedure to improve MVU estimation. We showcase our method and metric on synthetic and real-world datasets.
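
To make the definition of hard confidence concrete, the following is a minimal sketch of the Monte Carlo baseline that the abstract says DMV is meant to bypass; it is not the authors' DMV method. Everything named here is an assumption for illustration: the function `hard_confidence_mc`, the toy classifier `predict`, the standard-normal sampler `sample_missing` for the missing feature, and the choice of taking the current prediction from a zero-imputed input.

```python
import numpy as np

def hard_confidence_mc(predict, x, missing_mask, sample_missing,
                       current_label, n_samples=20000, seed=0):
    """Monte Carlo estimate of hard confidence (illustrative sketch).

    Hard confidence is taken here as the fraction of sampled completions
    of x whose predicted label equals current_label.
    """
    rng = np.random.default_rng(seed)
    # Repeat the observed input, then overwrite the missing columns
    # with draws from the assumed distribution of the missing values.
    completions = np.tile(x, (n_samples, 1))
    completions[:, missing_mask] = sample_missing(rng, n_samples)
    labels = predict(completions)
    return float(np.mean(labels == current_label))

# Hypothetical toy classifier: label 1 when the features sum to a positive value.
def predict(X):
    return (X.sum(axis=1) > 0).astype(int)

# Assumed distribution for the single missing feature: standard normal.
def sample_missing(rng, n):
    return rng.normal(0.0, 1.0, size=(n, 1))

x = np.array([1.0, np.nan])            # second feature is missing
missing_mask = np.array([False, True])

# Assumption: the current prediction comes from imputing the missing value with 0.
x_imputed = np.where(missing_mask, 0.0, x)
current_label = int(predict(x_imputed[None, :])[0])

conf = hard_confidence_mc(predict, x, missing_mask, sample_missing, current_label)
print(f"current label: {current_label}, estimated hard confidence: {conf:.3f}")
```

In this toy setup the prediction flips only when the missing feature falls below -1, so the estimate should sit near the standard normal probability of exceeding -1 (about 0.84). The cost of drawing and scoring many completions per input is the expense the abstract attributes to Monte Carlo sampling.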

Supplementary Material: zip

Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)

Submission Number: 20139
