Keywords: conformal prediction, distribution shifts, subpopulation shifts, uncertainty quantification
TL;DR: Conformal prediction loses its coverage guarantees under distribution shifts. We show how to provably adapt it to handle unknown subpopulation shifts.
Abstract: Conformal prediction is widely used to equip black-box machine learning models with uncertainty quantification, offering formal coverage guarantees under exchangeable data. However, these guarantees fail when faced with subpopulation shifts, where the test environment contains a different mix of subpopulations than the calibration data.
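To make this failure mode concrete, below is a minimal sketch of *standard* split conformal prediction (not the paper's method) on synthetic data with two subpopulations; all names, distributions, and parameters are illustrative assumptions. Calibrating on one subpopulation mix yields the promised marginal coverage on exchangeable test data, but under-covers once the test mix shifts.

```python
# Minimal sketch of standard split conformal prediction under a
# subpopulation shift. Purely illustrative; not the paper's method.
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, level, method="higher")

rng = np.random.default_rng(0)
alpha = 0.1

def sample_scores(n, p_group1):
    """Nonconformity scores from a mix of two groups with different noise scales.
    The (hypothetical) model predicts 0, so the score is the absolute residual."""
    g = rng.random(n) < p_group1
    y = rng.normal(0.0, np.where(g, 3.0, 1.0))
    return np.abs(y)

cal_scores = sample_scores(5000, p_group1=0.1)   # calibration mix: mostly group 0
qhat = conformal_quantile(cal_scores, alpha)

test_iid   = sample_scores(5000, p_group1=0.1)   # exchangeable with calibration
test_shift = sample_scores(5000, p_group1=0.9)   # subpopulation shift at test time

print("coverage (exchangeable):", np.mean(test_iid   <= qhat))  # ~ 0.90
print("coverage (shifted):     ", np.mean(test_shift <= qhat))  # well below 0.90
```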
In this work, we focus on *unknown* subpopulation shifts, where we are not given group information, i.e., the subpopulation labels of data points must be inferred.
We propose new methods that provably adapt conformal prediction to such shifts, ensuring valid coverage without explicit knowledge of subpopulation structure.
While existing methods in similar setups assume perfect subpopulation labels, our framework explicitly relaxes this requirement and characterizes the conditions under which formal coverage guarantees remain feasible.
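For contrast, here is a hedged sketch of the group-conditional calibration that label-aware methods rely on. The group labels (`groups` below) are assumed to be observed, which is exactly the requirement this work relaxes; all names are illustrative, and this is not the paper's algorithm.

```python
# Sketch of group-conditional conformal calibration with *known* group labels.
import numpy as np

def groupwise_quantiles(scores, groups, alpha):
    """One finite-sample-corrected (1 - alpha) quantile per observed group."""
    qhats = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        n = len(s)
        level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        qhats[g] = np.quantile(s, level, method="higher")
    return qhats

# At test time, a point from group g is covered iff its score <= qhats[g].
# This yields per-group coverage, but only if g is observed for every point.
```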
Further, our algorithms scale to high-dimensional settings and remain practical in realistic machine learning tasks. Extensive experiments on vision (with vision transformers) and language (with large language models) benchmarks demonstrate that our methods reliably maintain coverage and effectively control risks in scenarios where standard conformal prediction fails.
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 23126