Robust Conformal Prediction under Joint Distribution Shift

15 May 2024 (modified: 06 Nov 2024) · Submitted to NeurIPS 2024 · CC BY 4.0
Keywords: conformal prediction, data distribution shift, coverage robustness
TL;DR: We propose the Normalized Truncated Wasserstein distance of conformal score CDFs to capture the coverage difference of conformal prediction (CP) under data shift, and develop Multi-domain Robust CP to make coverage approach the nominal confidence in all test domains.
Abstract: Uncertainty prevails due to limited knowledge about the data or the model, and conformal prediction (CP) predicts a set of potential targets intended to cover the true target with high probability. Regarding CP robustness, importance weighting can address covariate shift, but CP under joint distribution shift remains more challenging. Prior attempts to address joint shift via $f$-divergence ignore the nuances of the calibration and test distributions that are critical for coverage guarantees. More generally, when multiple test distributions are shifted from the calibration distribution, simultaneous coverage guarantees for all test domains require a new paradigm. We design Multi-domain Robust Conformal Prediction (mRCP), which first formulates the coverage difference that importance weighting fails to capture under joint shift. To shrink this coverage difference and guarantee $(1-\alpha)$ coverage in all test domains, we propose the Normalized Truncated Wasserstein distance (NTW) to comprehensively capture the nuances between any test and calibration conformal score distributions, and we design an end-to-end training algorithm incorporating NTW that provides the elasticity needed for simultaneous coverage guarantees over distinct test domains. Across diverse tasks (seven datasets) and architectures (black-box and physics-informed models), NTW correlates strongly (Pearson coefficient = 0.905) with coverage differences beyond covariate shift, while mRCP robustly reduces the coverage gap by 50% on average over multiple distinct test domains.
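To make the abstract's central quantity concrete, the sketch below computes a normalized, truncated Wasserstein-style distance between the empirical CDFs of calibration and test conformal scores. The function name, the truncation at the $(1-\alpha)$ calibration quantile, and the normalization by the truncated interval length are illustrative assumptions; they need not match the paper's exact NTW definition.

```python
import numpy as np

def ntw_distance(cal_scores, test_scores, alpha=0.1, n_grid=1000):
    """Illustrative truncated Wasserstein-style distance between the empirical
    CDFs of calibration and test conformal scores (assumed form, not the
    paper's exact NTW definition)."""
    cal_scores = np.sort(np.asarray(cal_scores, dtype=float))
    test_scores = np.sort(np.asarray(test_scores, dtype=float))

    # Conformal threshold: (1 - alpha) empirical quantile of calibration scores.
    q_hat = np.quantile(cal_scores, 1 - alpha)

    # Grid over the truncated score range [min score, q_hat].
    lo = min(cal_scores[0], test_scores[0])
    grid = np.linspace(lo, q_hat, n_grid)

    # Empirical CDFs of both score samples evaluated on the grid.
    cdf_cal = np.searchsorted(cal_scores, grid, side="right") / len(cal_scores)
    cdf_test = np.searchsorted(test_scores, grid, side="right") / len(test_scores)

    # Integrate |CDF difference| over the truncated range and normalize by its
    # length so the value is comparable across score scales.
    width = max(q_hat - lo, 1e-12)
    return np.trapz(np.abs(cdf_cal - cdf_test), grid) / width
```

Under this reading, a small value indicates that the test score distribution closely tracks the calibration distribution below the conformal threshold, which is the regime that determines whether the $(1-\alpha)$ coverage carries over to the shifted test domain.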
Primary Area: Probabilistic methods (for example: variational inference, Gaussian processes)
Submission Number: 16141