Selective Prediction via Training Dynamics

TMLR Paper3013 Authors

17 Jul 2024 (modified: 18 Nov 2024)Under review for TMLREveryoneRevisionsBibTeXCC BY 4.0
Abstract: Selective Prediction is the task of rejecting inputs a model would predict incorrectly on. This involves a trade-off between input space coverage (how many data points are accepted) and model utility (how good is the performance on accepted data points). Current methods for selective prediction typically impose constraints on either the model architecture or the optimization objective; this inhibits their usage in practice and introduces unknown interactions with pre-existing loss functions. In contrast to prior work, we show that state-of-the-art se- lective prediction performance can be attained solely from studying the (discretized) training dynamics of a model. We propose a general framework that, given a test input, monitors metrics capturing the instability of predictions from intermediate models (i.e., checkpoints) obtained during training w.r.t. the final model’s prediction. In particular, we reject data points exhibiting too much disagreement with the final prediction at late stages in training. The proposed rejection mechanism is domain-agnostic (i.e., it works for both discrete and real-valued prediction) and can be flexibly combined with existing selective prediction approaches as it does not require any train-time modifications. Our experimental evaluation on image classification, regression, and time series problems shows that our method beats past state-of-the-art accuracy/utility trade-offs on typical selective prediction benchmarks.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: N/A
Assigned Action Editor: ~Yiming_Ying1
Submission Number: 3013
Loading