Improving the Statistical Efficiency of Cross-Conformal Prediction

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
Abstract: Vovk (2015) introduced cross-conformal prediction, a modification of split conformal prediction designed to improve the width of prediction sets. The method, when run with a miscoverage rate equal to $\alpha$ and $n \gg K$, ensures a marginal coverage of at least $1 - 2\alpha - 2(1-\alpha)(K-1)/(n+K)$, where $n$ is the number of observations and $K$ denotes the number of folds. A simple modification of the method achieves coverage of at least $1-2\alpha$. In this work, we propose new variants of both methods that yield smaller prediction sets without compromising these theoretical guarantees. The proposed methods are based on recent results deriving more statistically efficient combinations of p-values that leverage exchangeability and randomization. Simulations confirm the theoretical findings and bring out some important tradeoffs.
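To make the two coverage guarantees quoted in the abstract concrete, the following minimal sketch evaluates them numerically. The function names and the input checks are illustrative assumptions, not part of the paper or its code repository; only the two formulas come from the abstract.

```python
def cross_conformal_coverage_bound(alpha: float, n: int, K: int) -> float:
    """Evaluate 1 - 2*alpha - 2*(1 - alpha)*(K - 1)/(n + K), the bound stated in the abstract.

    alpha: nominal miscoverage rate, n: number of observations, K: number of folds.
    The function name and the argument checks are hypothetical, for illustration only.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not 1 <= K <= n:
        raise ValueError("K must satisfy 1 <= K <= n")
    slack = 2.0 * (1.0 - alpha) * (K - 1) / (n + K)
    return 1.0 - 2.0 * alpha - slack


def modified_coverage_bound(alpha: float) -> float:
    """Evaluate 1 - 2*alpha, the bound the abstract states for the modified method."""
    return 1.0 - 2.0 * alpha


if __name__ == "__main__":
    alpha, n = 0.1, 1000
    for K in (5, 10, 50):
        # The extra slack term grows with K for fixed n, so the bound decreases.
        print(K, cross_conformal_coverage_bound(alpha, n, K), modified_coverage_bound(alpha))
```

For example, with $\alpha = 0.1$, $n = 1000$, and $K = 10$, the first bound is roughly 0.784, slightly below the $1 - 2\alpha = 0.8$ guaranteed for the modified method; the gap shrinks as $n$ grows relative to $K$.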
Lay Summary: Imagine a tool that not only makes predictions but also tells you how confident it is in those predictions. Conformal prediction methods do exactly that. Instead of providing a single outcome, they return a set of possible values that is guaranteed to contain the correct answer with a specified level of confidence. This allows users to better understand the uncertainty behind a machine learning model’s output in a clear and reliable way. This research focuses on enhancing a particular version of this approach known as cross-conformal prediction. While this method already produces prediction sets that meet the desired coverage (meaning they include the true value with high probability), it can sometimes be inefficient, resulting in prediction sets that are unnecessarily wide. We introduce new variants that preserve the same level of reliability while producing narrower and more informative sets. These improvements result from more efficient methods for combining p-values, which are a fundamental tool in statistical inference. In practical terms, these advancements make cross-conformal prediction more usable by providing tighter prediction sets. This can be especially valuable in real-world applications, such as industrial settings, where thousands of predictions are made each day and decision-making depends on accurate information.
Link To Code: https://github.com/matteogaspa/EffCrossCP
Primary Area: Probabilistic Methods
Keywords: Conformal Prediction, Exchangeability, Randomization
Submission Number: 6897