Massively Parallel Hyperparameter TuningDownload PDF

15 Feb 2018 (modified: 21 Apr 2024)ICLR 2018 Conference Blind SubmissionReaders: Everyone
Abstract: Modern machine learning models are characterized by large hyperparameter search spaces and prohibitively expensive training costs. For such models, we cannot afford to train candidate models sequentially and wait months before finding a suitable hyperparameter configuration. Hence, we introduce the large-scale regime for parallel hyperparameter tuning, where we need to evaluate orders of magnitude more configurations than available parallel workers in a small multiple of the wall-clock time needed to train a single model. We propose a novel hyperparameter tuning algorithm for this setting that exploits both parallelism and aggressive early-stopping techniques, building on the insights of the Hyperband algorithm. Finally, we conduct a thorough empirical study of our algorithm on several benchmarks, including large-scale experiments with up to 500 workers. Our results show that our proposed algorithm finds good hyperparameter settings nearly an order of magnitude faster than random search.
Keywords: parallel hyperparameter tuning, deep learning
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](
9 Replies