Track: Tiny Paper Track (between 2 and 4 pages)
Keywords: learning-to-defer, risk control
Abstract: The learning to defer (L2D) framework lets a model defer its prediction to an expert based on the model's uncertainty. We consider an L2D setting for sequence outputs in which a small model can defer specific parts of its predicted sequence to a large model, so that the two models are interleaved throughout the prediction. We propose a Learn then Test approach to tune a token-level, confidence-based thresholding rejector for pre-trained predictors, with statistical guarantees of staying within a user-defined budget while maximizing accuracy. We use Bayesian optimization to efficiently search the space of thresholds. In experiments on text summarization, we empirically demonstrate that this method achieves budget control while maintaining the prediction quality of the overall system.
Submission Number: 37
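The calibration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the budget is the expected fraction of tokens deferred to the large model, that a token is deferred when the small model's confidence falls below a threshold, and that candidate thresholds come from a plain grid rather than the Bayesian optimization search the abstract mentions. The function names, the Hoeffding p-value, and the fixed-sequence testing order are all illustrative choices.

```python
import math


def deferral_rate(confidences, threshold):
    """Fraction of tokens in one sequence whose confidence is below the threshold."""
    return sum(c < threshold for c in confidences) / len(confidences)


def hoeffding_p_value(empirical_risk, budget, n):
    """P-value for the null 'true risk > budget', valid for losses bounded in [0, 1]."""
    gap = max(budget - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def calibrate_threshold(calibration_set, thresholds, budget, delta):
    """Fixed-sequence testing over thresholds, from least to most deferral.

    Returns the largest threshold certified to keep the expected deferral
    rate within `budget` with probability at least 1 - delta, or None if
    no threshold can be certified. Assumption: more deferral to the large
    model is preferred whenever the budget allows it.
    """
    n = len(calibration_set)
    selected = None
    for t in sorted(thresholds):
        risk = sum(deferral_rate(seq, t) for seq in calibration_set) / n
        if hoeffding_p_value(risk, budget, n) > delta:
            break  # first failure to reject: stop, per fixed-sequence testing
        selected = t
    return selected


# Hypothetical calibration data: per-token confidences of the small model.
calibration = [[0.1, 0.3, 0.5, 0.7, 0.9] for _ in range(200)]
grid = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
chosen = calibrate_threshold(calibration, grid, budget=0.5, delta=0.1)
```

Testing thresholds in order of increasing deferral and stopping at the first non-rejection keeps the family-wise error at `delta` without a Bonferroni correction. The cost is that no threshold beyond the first failure is ever considered.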