Keywords: conformal prediction, uncertainty quantification, data curation
Abstract: We consider the problem of conformalising a fixed, pretrained predictor for deployment. Conformal prediction ensures finite-sample marginal coverage, but requires a labelled calibration set exchangeable with the test distribution. When labels are expensive, calibration data must be acquired selectively from an unlabelled pool. This creates a tension: adaptive curation policies often violate exchangeability, while naive random sampling is inefficient or miscalibrated under covariate shift. We propose a simple acquisition strategy under a fixed label budget using density-ratio rejection sampling. Using only unlabelled covariates, we estimate the target-to-pool density ratio and accept pool points for labelling with probability proportional to this ratio, yielding calibration pairs i.i.d. from the target distribution. This enables standard conformal prediction without the effective-sample-size loss associated with weighted conformal methods. Synthetic regression experiments on covariate-shifted pools demonstrate valid coverage and significantly tighter prediction intervals.
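The acquisition strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one-dimensional covariates, swaps in a simple Gaussian fit as the density-ratio estimator, and uses the largest estimated ratio over the pool as the rejection-sampling bound. All function names, the synthetic data, and the predictor `np.sin` are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)


def estimate_ratio(pool_x, target_x):
    """Estimate the target-to-pool density ratio from unlabelled covariates.

    Assumption: both marginals are modelled as Gaussians. Any other
    density-ratio estimator could be substituted here.
    """
    mu_p, sd_p = pool_x.mean(), pool_x.std()
    mu_t, sd_t = target_x.mean(), target_x.std()

    def ratio(x):
        return np.exp(gaussian_logpdf(x, mu_t, sd_t) - gaussian_logpdf(x, mu_p, sd_p))

    return ratio


def select_for_labelling(pool_x, ratio, budget, rng):
    """Rejection-sample at most `budget` pool indices to send for labelling."""
    w = ratio(pool_x)
    accept_prob = w / w.max()  # assumption: empirical max stands in for sup of the ratio
    order = rng.permutation(len(pool_x))  # visit the pool in random order
    accepted = order[rng.random(len(pool_x)) < accept_prob[order]]
    return accepted[:budget]


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]


rng = np.random.default_rng(0)
predictor = np.sin  # stands in for the fixed, pretrained predictor


def draw_labels(x, rng):
    """Synthetic labelling oracle with noise that grows with |x|."""
    return np.sin(x) + 0.3 * (1.0 + np.abs(x)) * rng.standard_normal(len(x))


pool_x = rng.normal(0.0, 2.0, size=5000)    # unlabelled pool covariates
target_x = rng.normal(1.0, 1.0, size=5000)  # unlabelled target covariates

ratio = estimate_ratio(pool_x, target_x)
budget = 200
idx = select_for_labelling(pool_x, ratio, budget, rng)

# Labels are requested only for the accepted points.
cal_x = pool_x[idx]
cal_y = draw_labels(cal_x, rng)
q = conformal_quantile(np.abs(cal_y - predictor(cal_x)), alpha=0.1)

# Empirical coverage of the interval predictor(x) +/- q on fresh target data.
test_x = rng.normal(1.0, 1.0, size=5000)
test_y = draw_labels(test_x, rng)
coverage = np.mean(np.abs(test_y - predictor(test_x)) <= q)
print(f"interval half-width: {q:.3f}, empirical target coverage: {coverage:.3f}")
```

Because the accepted points are treated as an unweighted sample from the target, the quantile is the ordinary split-conformal one and no importance weights enter the calibration step. The guarantee in this sketch is only as good as the estimated ratio and the bound used for acceptance.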
Submission Number: 5