Keywords: Model Evaluation, Uncertainty Quantification, Markov Decision Process, Policy Gradient, Auto-differentiation
TL;DR: Supervised data suffers severe selection bias when labels are expensive. We formulate an MDP over posterior beliefs on model performance and solve it with pathwise policy gradients computed through an auto-differentiable pipeline.
Abstract: Datasets often suffer severe selection bias; clinical labels are only available on patients for whom doctors ordered medical exams. To assess model performance outside the support of available data, we present a computational framework for adaptive labeling, providing cost-efficient model evaluations under severe distribution shifts. We formulate the problem as a Markov Decision Process over states defined by posterior beliefs on model performance. Each batch of new labels incurs a “state transition” to sharper beliefs, and we choose batches to minimize uncertainty on model performance at the end of the label collection process. Instead of relying on high-variance REINFORCE policy gradient estimators that do not scale, we optimize our adaptive labeling policy using pathwise policy gradients computed by auto-differentiating through simulated roll-outs. Our framework is agnostic to the choice of uncertainty quantification approach and highlights the virtue of planning in adaptive labeling. On synthetic and real datasets, we empirically demonstrate that even a one-step lookahead policy substantially outperforms active learning-inspired heuristics.
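To make the pathwise-gradient idea concrete, the sketch below (not the authors' pipeline; all function names such as `posterior_update` and the simplified Gaussian belief update are hypothetical) shows a one-step-lookahead rollout in JAX: imagined labels are drawn from the current posterior via the reparameterization trick, the belief is updated with a relaxed (soft) batch selection so the rollout stays differentiable, and the policy parameters are improved by auto-differentiating the end-of-rollout uncertainty.

```python
# Illustrative sketch of a pathwise policy gradient for one-step-lookahead
# adaptive labeling under a simplified Gaussian posterior belief.
# NOTE: hypothetical names and a deliberately simplified belief update;
# this is not the submission's actual implementation.

import jax
import jax.numpy as jnp


def posterior_update(mu, cov, sel_weights, y_obs, noise=0.1):
    """Soft Bayesian-style update: each candidate point is weighted by its
    relaxed selection probability so the update remains differentiable."""
    eff_noise = noise / (sel_weights + 1e-6)          # heavily selected points get low noise
    gain = cov / (jnp.diag(cov) + eff_noise)          # simplified per-point gains
    mu_new = mu + gain @ ((y_obs - mu) * sel_weights)
    cov_new = cov - (gain * sel_weights) @ cov
    return mu_new, cov_new


def one_step_objective(params, key, mu, cov):
    """Expected posterior uncertainty about average model performance after
    labeling one (soft) batch, estimated with a reparameterized rollout."""
    sel_weights = jax.nn.softmax(params)              # relaxed labeling policy
    eps = jax.random.normal(key, mu.shape)            # pathwise (reparameterized) noise
    y_sim = mu + jnp.sqrt(jnp.diag(cov)) * eps        # imagined labels from current belief
    _, cov_new = posterior_update(mu, cov, sel_weights, y_sim)
    return jnp.mean(cov_new)                          # variance of the mean performance


# Pathwise policy gradient: differentiate straight through the simulated rollout.
grad_fn = jax.jit(jax.grad(one_step_objective))

key = jax.random.PRNGKey(0)
n = 8
mu, cov = jnp.zeros(n), jnp.eye(n)
params = jnp.zeros(n)
for _ in range(100):
    key, sub = jax.random.split(key)
    params = params - 0.1 * grad_fn(params, sub, mu, cov)
```

Under these assumptions, the gradient flows through both the imagined labels and the belief update, which is what distinguishes a pathwise estimator from a REINFORCE-style score-function estimator.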
Primary Area: Evaluation (methodology, meta studies, replicability and validity)
Submission Number: 13864