TL;DR: An adaptive learn-then-test calibration procedure for efficient and statistically valid hyperparameter selection.
Abstract: We introduce adaptive learn-then-test (aLTT), an efficient hyperparameter selection procedure that provides finite-sample statistical guarantees on the population risk of AI models. Unlike the existing learn-then-test (LTT) technique, which relies on conventional p-value-based multiple hypothesis testing (MHT), aLTT implements sequential data-dependent MHT with early termination by leveraging e-processes. As a result, aLTT can reduce the number of testing rounds, making it particularly well-suited for scenarios in which testing is costly or presents safety risks. While maintaining statistical validity, aLTT is shown to achieve the same performance as LTT in applications such as online policy selection for offline reinforcement learning and prompt engineering, using only a fraction of the testing rounds.
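To make the mechanism concrete, the sketch below illustrates e-process-based sequential testing with early termination for a set of candidate hyperparameters, in the spirit of the description above. It is a simplified illustration, not the authors' implementation (see the linked repository): the helper `query_loss`, the betting rate `bet_fraction`, and the Bonferroni-style e-value correction are assumptions, and the round-robin querying of active candidates stands in for aLTT's data-dependent acquisition rules.

```python
# Minimal sketch (assumptions noted above, not the aLTT implementation):
# sequential hyperparameter testing with betting e-processes and early stopping.
def eprocess_selection(candidates, query_loss, risk_tolerance=0.1,
                       delta=0.05, bet_fraction=0.5, max_rounds=1000):
    """Return candidates certified to have population risk below risk_tolerance.

    For each candidate k, test H0_k: E[loss_k] >= risk_tolerance with the
    betting e-process E_t = prod_t (1 + v * (risk_tolerance - loss_t)),
    a nonnegative supermartingale under H0_k for losses in [0, 1].
    A candidate is accepted once its e-process crosses K / delta; by Ville's
    inequality and a union bound over the K candidates, the probability of
    wrongly accepting any unreliable candidate is at most delta.
    """
    K = len(candidates)
    threshold = K / delta
    e_proc = {k: 1.0 for k in candidates}   # running e-process per candidate
    active = set(candidates)                # hypotheses still being tested
    accepted = set()

    for _ in range(max_rounds):
        if not active:
            break                           # early termination: all candidates certified
        for k in list(active):
            loss = query_loss(k)            # one costly testing round, loss in [0, 1]
            v = bet_fraction / (1.0 - risk_tolerance)  # keeps the bet factor nonnegative
            e_proc[k] *= 1.0 + v * (risk_tolerance - loss)
            if e_proc[k] >= threshold:
                accepted.add(k)             # certified: population risk < risk_tolerance
                active.discard(k)
    return accepted
```

Accepted candidates carry a finite-sample guarantee via Ville's inequality, and the loop can stop as soon as the tested hypotheses are resolved; this early stopping is where the reduction in testing rounds comes from in the sketch, while aLTT further sharpens it through data-dependent acquisition and termination.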
Lay Summary: Selecting a reliable hyperparameter configuration for large-scale machine learning models is a costly operation, as it requires multiple rounds of testing across a set of candidate hyperparameter values. In this work, we propose adaptive learn-then-test (aLTT), a hyperparameter selection algorithm that performs this evaluation efficiently using e-value–based testing instead of p-values. The main advantage of aLTT is that it leverages data-dependent acquisition and termination, which greatly improve efficiency without sacrificing statistical validity. Our approach is shown to outperform prior methods in settings such as online policy selection and automated prompt engineering for large language models.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/kclip/aLTT
Primary Area: Probabilistic Methods->Everything Else
Keywords: Calibration, Learn-then-Test, Hyperparameter Selection
Submission Number: 7026