TL;DR: We propose the Direct Prediction Set Minimization (DPSM) approach, which integrates the principle behind standard quantile regression with a differentiable measure of prediction set size conditioned on the learned quantile.
Abstract: Conformal prediction (CP) is a promising uncertainty quantification framework that works as a wrapper around a black-box classifier to construct prediction sets (i.e., subsets of candidate classes) with provable coverage guarantees.
However, standard calibration methods for CP tend to produce large prediction sets, which makes them less useful in practice.
This paper considers the problem of integrating conformal principles into the training process of deep classifiers to directly minimize the size of prediction sets.
We formulate conformal training as a bilevel optimization problem and propose the {\em Direct Prediction Set Minimization (DPSM)} algorithm to solve it.
The key insight behind DPSM is to minimize a measure of the prediction set size (upper level) that is conditioned on the learned quantile of conformity scores (lower level).
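In symbols, one plausible reading of this bilevel structure (our illustrative notation; the score $V$, threshold $q$, and set map $C$ are assumptions, not taken verbatim from the paper) is
$$\min_{\theta}\ \mathbb{E}\big[\,|C_{\hat q(\theta)}(x;\theta)|\,\big] \quad \text{s.t.} \quad \hat q(\theta) \in \arg\min_{q}\ \mathbb{E}\big[\rho_{1-\alpha}\big(V(x,y;\theta) - q\big)\big],$$
where $V(x,y;\theta)$ is a nonconformity score, $\rho_{1-\alpha}$ is the pinball loss whose minimizer is the $(1-\alpha)$-quantile, and $C_{q}(x;\theta) = \{y : V(x,y;\theta) \le q\}$ is the prediction set at threshold $q$.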
We show that DPSM has a learning bound of $O(1/\sqrt{n})$ (with $n$ training samples),
while prior conformal training methods based on stochastic approximation of the quantile have a bound of $\Omega(1/s)$ (with batch size $s$, and typically $s \ll \sqrt{n}$).
Experiments on various benchmark datasets and deep models show that DPSM significantly outperforms the best prior conformal training baseline, reducing the prediction set size by $20.46\%$, and validate our theory.
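As a concrete illustration, below is a minimal PyTorch sketch of this bilevel idea, assuming a softmax-based nonconformity score, a pinball loss for the lower-level quantile fit, and a sigmoid relaxation of set membership for the upper-level set size. The function names, the smoothing temperature `tau`, and the single-loop joint update are our assumptions, not the paper's exact algorithm.

```python
# Minimal sketch of the DPSM bilevel idea (illustrative, not the authors'
# exact implementation). Assumptions: nonconformity score V(x, y) = 1 - p_y(x),
# a pinball loss fits the (1 - alpha)-quantile q (lower level), and a sigmoid
# with temperature tau smooths the set-membership indicator (upper level).
import torch
import torch.nn.functional as F

def pinball_loss(q, scores, alpha):
    """Quantile-regression loss; minimized at the (1 - alpha)-quantile."""
    diff = scores - q
    return torch.mean(torch.maximum((1 - alpha) * diff, -alpha * diff))

def dpsm_loss(logits, labels, q, alpha=0.1, tau=0.1):
    probs = F.softmax(logits, dim=1)
    # Lower level: fit q to the true-class scores (scores detached,
    # so this term only trains q).
    true_scores = 1.0 - probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    lower = pinball_loss(q, true_scores.detach(), alpha)
    # Upper level: smoothed expected prediction-set size, conditioned on the
    # learned quantile (q detached, so this term only trains the classifier).
    all_scores = 1.0 - probs
    membership = torch.sigmoid((q.detach() - all_scores) / tau)
    upper = membership.sum(dim=1).mean()
    return upper + lower

# Usage: q is a learnable scalar trained jointly with the classifier, e.g.
# q = torch.nn.Parameter(torch.tensor(0.9))
# loss = dpsm_loss(model(x), y, q); loss.backward()
```

The `detach` calls keep the two levels from interfering with each other's gradients, approximating the bilevel structure with a single joint update; an alternating update of $\theta$ and $q$ would be another natural choice.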
Lay Summary: When machine learning models make predictions, it is important to know how confident they are—especially in high-stakes situations like medical diagnosis.
Conformal prediction is a method that can wrap around any model to provide a set of likely answers, with a statistical guarantee that the set contains the true answer.
However, these sets are often too large to be useful.
In this work, we propose a new method called Direct Prediction Set Minimization (DPSM) that helps models produce much smaller prediction sets while maintaining the same reliability.
The main idea is to directly train the model to minimize the size of these prediction sets by combining two levels of learning: one level learns the right threshold, and the other uses this threshold to guide the model's training.
Our theoretical analysis shows that DPSM learns more efficiently than existing methods.
Experiments on real-world datasets show that DPSM reduces the prediction set size by over $20\%$ compared to previous best methods, making machine learning models both reliable and more practical.
Link To Code: https://github.com/YuanjieSh/DPSM_code
Primary Area: General Machine Learning
Keywords: conformal prediction, uncertainty quantification, deep classifiers, conformal training
Submission Number: 7858