Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings

Published: 01 Jan 2019 · Last Modified: 30 Sep 2024 · CoRR 2019 · CC BY-SA 4.0
Abstract: We propose probabilistic models that can extrapolate learning curves of iterative machine learning algorithms, such as stochastic gradient descent for training deep networks, based on training data with variable-length learning curves. We study instantiations of this framework based on random forests and Bayesian recurrent neural networks. Our experiments show that these models yield better predictions than state-of-the-art models from the hyperparameter optimization literature when extrapolating the performance of neural networks trained with different hyperparameter settings.
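The paper's own models are not reproduced here, but the core idea of a probabilistic rollout can be illustrated: fit a one-step-ahead predictor that maps (hyperparameters, recent curve values) to the next value, then extrapolate autoregressively while sampling to obtain a distribution over future trajectories. The sketch below is a minimal, hedged example of that pattern using a scikit-learn random forest on synthetic curves; the synthetic curve family, window size, and per-tree sampling as an uncertainty proxy are all assumptions for illustration, not the authors' setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic learning curves y_t = 1 - exp(-a * t) + noise, where the
# "hyperparameter" a controls convergence speed (an assumed toy family).
def make_curve(a, length, noise=0.01):
    t = np.arange(1, length + 1)
    return 1.0 - np.exp(-a * t) + rng.normal(0.0, noise, size=length)

WINDOW = 5  # number of past observations fed to the one-step model

# One-step-ahead regression dataset: features are the hyperparameter plus
# a sliding window of past curve values; the target is the next value.
X, y = [], []
for _ in range(200):
    a = rng.uniform(0.05, 0.5)
    curve = make_curve(a, length=30)
    for t in range(WINDOW, len(curve)):
        X.append(np.concatenate(([a], curve[t - WINDOW:t])))
        y.append(curve[t])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def probabilistic_rollout(a, prefix, horizon, n_samples=50):
    """Extrapolate a partial curve by repeatedly sampling one-step
    predictions from individual trees, giving a distribution over
    future trajectories."""
    samples = np.empty((n_samples, horizon))
    for s in range(n_samples):
        hist = list(prefix)
        for h in range(horizon):
            feat = np.concatenate(([a], hist[-WINDOW:])).reshape(1, -1)
            # Draw one tree's prediction as a crude Monte Carlo sample
            # from the forest's predictive distribution.
            tree = model.estimators_[rng.integers(len(model.estimators_))]
            nxt = tree.predict(feat)[0]
            hist.append(nxt)
            samples[s, h] = nxt
    return samples.mean(axis=0), samples.std(axis=0)

# Extrapolate an unseen curve from its first 10 observed epochs.
a_test = 0.2
full = make_curve(a_test, 30)
mean, std = probabilistic_rollout(a_test, full[:10], horizon=20)
print("final-value prediction: %.3f +/- %.3f (true %.3f)"
      % (mean[-1], std[-1], full[-1]))
```

The same rollout interface would apply to a Bayesian recurrent network by sampling weight or state realizations instead of trees; the random forest variant is shown only because its sampling mechanism is compact.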
