Abstract: Learning to defer (L2D) aims to optimize human-AI collaboration by allocating each prediction task to either a machine learning model or a human expert, depending on which is most likely to be correct. This allocation decision is governed by a rejector: a meta-model that routes inputs based on estimated success probabilities. In practice, a poorly fitted or otherwise misspecified rejector can jeopardize the entire L2D workflow because of its central role in routing inputs. In this work, we perform uncertainty quantification for the rejector. We use conformal prediction to allow the rejector to output prediction sets or intervals instead of just the binary outcome of ‘defer’ or not. On tasks ranging from image to hate speech classification, we demonstrate that the uncertainty in the rejector translates to safer decisions via two forms of selective prediction.
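To make the idea concrete, the sketch below shows how split conformal prediction can turn a binary rejector's probabilities into prediction sets over {don't defer, defer}. It is a minimal illustration, not the submission's implementation (see the linked repository for that); the function names `conformal_quantile` and `rejector_prediction_set`, the synthetic calibration data, and the 0.1 miscoverage level are all assumptions made for this example.

```python
import numpy as np

def conformal_quantile(cal_probs, cal_labels, alpha=0.1):
    """Compute the split-conformal threshold on a held-out calibration set.

    cal_probs  : (n, 2) rejector probabilities for each calibration input
    cal_labels : (n,) correct rejector decision (0 = don't defer, 1 = defer)
    alpha      : target miscoverage level (e.g. 0.1 for ~90% coverage)
    """
    n = len(cal_labels)
    # Nonconformity score: 1 minus the probability assigned to the true decision.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(q_level, 1.0), method="higher")

def rejector_prediction_set(test_probs, qhat):
    """Return the set of rejector decisions whose nonconformity score is below qhat."""
    # Decision d is included if 1 - p(d) <= qhat, i.e. p(d) >= 1 - qhat.
    return [d for d in (0, 1) if 1.0 - test_probs[d] <= qhat]

# Illustrative usage with synthetic calibration data: a confident input tends to
# yield a singleton set, while an uncertain one yields {0, 1}, signalling that the
# routing decision itself should be handled cautiously (e.g. via selective prediction).
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet([2, 2], size=500)   # stand-in rejector outputs
cal_labels = rng.integers(0, 2, size=500)     # stand-in ground-truth decisions
qhat = conformal_quantile(cal_probs, cal_labels, alpha=0.1)
print(rejector_prediction_set(np.array([0.95, 0.05]), qhat))  # typically [0]
print(rejector_prediction_set(np.array([0.55, 0.45]), qhat))  # typically [0, 1]
```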
Submission Type: Regular submission (no more than 12 pages of main content)
Code: https://github.com/yizirui/conformal_L2D
Assigned Action Editor: ~Manuel_Haussmann1
Submission Number: 6288