Bridging Fairness and Efficiency in Conformal Inference: A Surrogate-Assisted Group-Clustered Approach

Published: 01 May 2025, Last Modified: 18 Jun 2025
Venue: ICML 2025 poster
License: CC BY-NC 4.0
TL;DR: We propose a surrogate-assisted clustered conformal inference framework that constructs efficient prediction sets by pooling protected groups into larger clusters and leveraging surrogate outcomes.
Abstract: Standard conformal prediction ensures marginal coverage but consistently undercovers underrepresented groups, limiting its reliability for fair uncertainty quantification. Group fairness requires prediction sets to achieve a user-specified coverage level within each protected group. While group-wise conformal inference meets this requirement, it often produces excessively wide prediction sets due to limited sample sizes in underrepresented groups, highlighting a fundamental tradeoff between fairness and efficiency. To bridge this gap, we introduce Surrogate-Assisted Group-Clustered Conformal Inference (SAGCCI), a framework that improves efficiency through two key innovations: (1) clustering protected groups with similar conformal score distributions to enhance precision while maintaining fairness, and (2) deriving an efficient influence function that optimally integrates surrogate outcomes to construct tighter prediction sets. Theoretically, SAGCCI guarantees approximate group-conditional coverage in a doubly robust manner under mild convergence conditions, enabling flexible nuisance model estimation. Empirically, through simulations and an analysis of the phase 3 Moderna COVE COVID-19 vaccine trial, we demonstrate that SAGCCI outperforms existing methods, producing narrower prediction sets while maintaining valid group-conditional coverage, effectively balancing fairness and efficiency in uncertainty quantification.
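The tradeoff described in the abstract can be illustrated with a minimal sketch of plain split conformal calibration done group by group. This is not SAGCCI: the clustering and surrogate-based influence function steps are not implemented here, and the function names, the synthetic scores, and the group labels "a" and "b" are all hypothetical. The sketch only shows why a small group yields an uninformative (infinite) threshold, and how pooling calibration scores gives a finite one, which is reasonable only if the pooled groups have similar score distributions.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest
    calibration score, or +inf when the group is too small for that rank."""
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # not enough calibration points: the set is unbounded
    return float(np.sort(scores)[k - 1])

def groupwise_thresholds(scores, groups, alpha):
    """Calibrate one threshold per protected group (group-wise conformal)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    return {g.item(): conformal_quantile(scores[groups == g], alpha)
            for g in np.unique(groups)}

# Synthetic absolute-residual scores (illustrative only): a large group "a"
# and an underrepresented group "b" with the same score distribution.
rng = np.random.default_rng(0)
scores = np.abs(rng.normal(size=505))
groups = np.array(["a"] * 500 + ["b"] * 5)

alpha = 0.1
per_group = groupwise_thresholds(scores, groups, alpha)

# Crude stand-in for clustering: pool both groups into a single cluster.
pooled = conformal_quantile(scores, alpha)
```

Here `per_group["b"]` is infinite because five calibration scores cannot support a 90% threshold, while the pooled threshold is finite. A prediction interval for a new point would then be the point prediction plus or minus the threshold of its group or cluster.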
Lay Summary: How can we ensure that predictions from machine learning models are both fair and reliable, especially for underrepresented groups? A popular approach called conformal prediction provides uncertainty estimates that work well on average but often fails to give equally good results for different subgroups. In particular, it tends to underestimate uncertainty for smaller or underrepresented groups, which can lead to unfair or misleading predictions. Our work introduces a new method, called Surrogate-Assisted Group-Clustered Conformal Inference (SAGCCI), that tackles this fairness-efficiency tradeoff. It improves the quality of uncertainty estimates by clustering groups with similar patterns and borrowing information across them, so predictions can be more precise without losing fairness. It also uses auxiliary information, called “surrogates”, to sharpen predictions even when direct data is limited. We show mathematically that our method meets fairness goals under broad conditions and performs well even when models are misspecified. In both simulations and a real-world COVID-19 vaccine study, SAGCCI gave fairer and tighter prediction intervals than existing methods, suggesting a promising way forward for equitable and efficient machine learning.
Primary Area: General Machine Learning->Evaluation
Keywords: Conformal inference, Individualized prediction, Clustering, Surrogate outcomes, Group fairness
Submission Number: 5158