Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction

Published: 01 May 2025, Last Modified: 18 Jun 2025. ICML 2025 poster. License: CC BY 4.0
TL;DR: We introduce a unified framework for Venn and Venn-Abers calibration for generic prediction tasks and loss functions, including conformal prediction as a special case.
Abstract: Ensuring model calibration is critical for reliable prediction, yet popular distribution-free methods such as histogram binning and isotonic regression offer only asymptotic guarantees. We introduce a unified framework for Venn and Venn-Abers calibration that extends Vovk's approach beyond binary classification to a broad class of prediction tasks defined by generic loss functions. Our method transforms any perfectly in-sample calibrated predictor into a set-valued predictor that, in finite samples, outputs at least one marginally calibrated point prediction. These set predictions shrink asymptotically and converge to a conditionally calibrated prediction, capturing epistemic uncertainty. We further propose Venn multicalibration, a new approach for achieving finite-sample calibration across subpopulations. For quantile loss, our framework recovers group-conditional and multicalibrated conformal prediction as special cases and yields novel prediction intervals with quantile-conditional coverage.
Lay Summary: To be trustworthy, machine learning models must be well-calibrated—that is, when a model predicts an 80% chance of an event, that event should occur about 80% of the time. Traditional calibration methods, such as histogram binning and isotonic regression, adjust model outputs to align with observed outcomes, but they only guarantee reliable performance when applied to large datasets. We introduce a new approach that generalizes Vovk's method—originally developed for binary classification—to a broad class of prediction problems. Specifically, our framework extends calibration to any elicitable property defined via the minimization of a loss function, including but not limited to probabilities and quantiles. Instead of producing a single point prediction, our method outputs a set of possible predictions for each input. This set is guaranteed to contain at least one prediction that is well-calibrated, even in small-sample settings. As more data becomes available, the prediction sets shrink, eventually converging to a single, conditionally calibrated prediction. Moreover, we extend this approach so that calibration holds across different subpopulations, supporting fairness and reliability. Overall, our method provides a unified and practical solution for capturing and communicating uncertainty in model predictions, especially when data is limited or decisions are high-stakes.
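To make the set-valued idea concrete, here is a minimal sketch of the classic binary Venn-Abers procedure (the Vovk-style special case this paper generalizes), not the authors' released implementation. For each test point, the calibration set is refit once per hypothetical label, yielding a pair of probabilities that brackets a calibrated prediction; the function name `venn_abers_pair` and the toy data are illustrative.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def venn_abers_pair(cal_scores, cal_labels, test_score):
    """Classic binary Venn-Abers calibration (a sketch).

    For each hypothetical label y in {0, 1}, refit isotonic regression on
    the calibration data augmented with (test_score, y), then read off the
    fitted probability at test_score. The returned pair (p0, p1) is the
    multiprobability prediction: a set of two point predictions, at least
    one of which is calibrated in finite samples.
    """
    preds = []
    for y in (0, 1):
        scores = np.append(cal_scores, test_score)
        labels = np.append(cal_labels, y)
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(scores, labels)
        preds.append(float(iso.predict([test_score])[0]))
    return preds[0], preds[1]

# Toy example: scores that are themselves the true event probabilities.
rng = np.random.default_rng(0)
cal_scores = rng.uniform(size=200)
cal_labels = (rng.uniform(size=200) < cal_scores).astype(int)
p0, p1 = venn_abers_pair(cal_scores, cal_labels, 0.7)
```

As the calibration set grows, the gap between `p0` and `p1` shrinks, mirroring the paper's asymptotic convergence to a single conditionally calibrated prediction.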
Link To Code: https://github.com/Larsvanderlaan/VennCalibration
Primary Area: General Machine Learning->Everything Else
Keywords: Calibration, multicalibration, Venn-Abers, conformal prediction, isotonic calibration, distribution-free, epistemic uncertainty
Submission Number: 3932