Budget-Constrained Learning to Defer for Autoregressive Models

ICLR 2025 Workshop BuildingTrust Submission 37 Authors

07 Feb 2025 (modified: 06 Mar 2025). Submitted to BuildingTrust. License: CC BY 4.0

Track: Tiny Paper Track (between 2 and 4 pages)

Keywords: learning-to-defer, risk control

Abstract: The learning to defer (L2D) framework lets a model defer a prediction to an expert based on the model's uncertainty. We consider an L2D setting for sequence outputs in which a small model can defer specific parts of its prediction to a large model, so that the two models are interleaved throughout the prediction. We propose a Learn then Test approach for tuning a token-level, confidence-based thresholding rejector for pre-trained predictors, with statistical guarantees that the system stays within a user-defined budget while maximizing accuracy. We use Bayesian optimization to search the space of thresholds efficiently. In experiments on text summarization, we empirically demonstrate that this method achieves budget control while maintaining the prediction quality of the overall system.
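To make the calibration idea in the abstract concrete, the following is a minimal sketch, not the submission's implementation. It assumes that the budget is the expected fraction of tokens deferred to the large model, that a token is deferred when the small model's confidence falls below a threshold, and that a Hoeffding bound supplies the p-values. It also swaps the Bayesian optimization search for a plain grid with fixed-sequence testing to keep the example short. All function names are hypothetical.

```python
import math


def deferral_rate(confidences, threshold):
    """Fraction of tokens in one sequence whose confidence is below the threshold."""
    return sum(c < threshold for c in confidences) / len(confidences)


def hoeffding_p_value(empirical_risk, n, budget):
    """p-value for the null 'true risk > budget', assuming per-sequence losses in [0, 1]."""
    gap = max(budget - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def calibrate_threshold(calibration_set, budget, delta, grid):
    """Fixed-sequence testing over thresholds, from fewest to most deferrals.

    Returns the largest threshold certified to keep the expected deferral
    rate within `budget` with probability at least 1 - delta, or None if
    no threshold in the grid can be certified.
    (Assumption: a grid search stands in for Bayesian optimization.)
    """
    n = len(calibration_set)
    selected = None
    for threshold in sorted(grid):
        risk = sum(deferral_rate(seq, threshold) for seq in calibration_set) / n
        if hoeffding_p_value(risk, n, budget) > delta:
            break  # first failure to reject ends the sequence of tests
        selected = threshold
    return selected
```

Here `calibration_set` is a list of per-token confidence lists, one per held-out sequence. Under the assumption that deferring more tokens to the large model does not hurt quality, returning the largest certified threshold uses as much of the budget as the guarantee allows.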

Submission Number: 37
