Dynamically Learning the Learning Rates: Online Hyperparameter Optimization

03 Jan 2018 (modified: 25 Jan 2018) · ICLR 2018 Conference Withdrawn Submission
Abstract: Hyperparameter tuning is arguably the most important ingredient for obtaining state-of-the-art performance in deep networks. We focus on hyperparameters related to the optimization algorithm, e.g. learning rates, which have a large impact on training speed and the resulting accuracy. Typically, fixed learning rate schedules are employed during training. We propose Hyperdyn, a dynamic hyperparameter optimization method that selects new learning rates on the fly at the end of each epoch. Our explore-exploit framework combines Bayesian optimization (BO) with a rejection strategy based on a simple probabilistic wait-and-watch test. We obtain state-of-the-art accuracy on the CIFAR and ImageNet datasets, but with significantly faster training than the best manually tuned networks.
TL;DR: Bayesian-optimization-based online hyperparameter optimization.
Keywords: hyperparameters, optimization, SGD, Adam, Bayesian
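
To illustrate the kind of epoch-wise explore-exploit loop the abstract describes, below is a minimal, hedged sketch: a Gaussian-process surrogate proposes a learning rate for the next epoch, and a probabilistic acceptance test decides whether to keep it. This is not the paper's Hyperdyn implementation; the GP/UCB acquisition, the exponential acceptance rule, and names such as `propose_lr` and `wait_and_watch_accept` are assumptions made for illustration only.

```python
# Hypothetical sketch of an epoch-wise explore-exploit learning-rate selector.
# NOT the authors' Hyperdyn method; the surrogate, acquisition, and acceptance
# rule below are assumptions chosen to make the sketch self-contained.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def propose_lr(history, bounds=(1e-4, 1e-1), n_candidates=256):
    """Pick the next learning rate by maximizing a simple UCB acquisition."""
    if len(history) < 3:                 # explore: too little data to fit the GP
        return float(10 ** rng.uniform(np.log10(bounds[0]), np.log10(bounds[1])))
    X = np.log10([[lr] for lr, _ in history])
    y = np.array([loss for _, loss in history])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    cand = rng.uniform(np.log10(bounds[0]), np.log10(bounds[1]), size=(n_candidates, 1))
    mu, sigma = gp.predict(cand, return_std=True)
    ucb = -mu + 2.0 * sigma              # lower loss is better, so negate the mean
    return float(10 ** cand[np.argmax(ucb), 0])

def wait_and_watch_accept(old_loss, new_loss, temperature=0.05):
    """Probabilistic rejection test: keep a worse learning rate only occasionally."""
    if new_loss <= old_loss:
        return True
    return rng.random() < np.exp(-(new_loss - old_loss) / temperature)

# --- toy usage: replace `train_one_epoch` with real training + validation ---
def train_one_epoch(lr):
    return (np.log10(lr) + 2.0) ** 2 + 0.01 * rng.standard_normal()  # synthetic loss

history, current_lr, current_loss = [], 1e-2, train_one_epoch(1e-2)
for epoch in range(20):
    lr = propose_lr(history + [(current_lr, current_loss)])
    loss = train_one_epoch(lr)
    history.append((lr, loss))
    if wait_and_watch_accept(current_loss, loss):
        current_lr, current_loss = lr, loss
print(f"selected lr ≈ {current_lr:.4g}")
```

In practice the toy `train_one_epoch` would be replaced by one epoch of SGD or Adam followed by a validation pass, so the rejection test is comparing validation losses across epochs rather than values of a synthetic function.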
