Dynamically Learning the Learning Rates: Online Hyperparameter Optimization


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: Hyperparameter tuning is arguably the most important ingredient for obtaining state of art performance in deep networks. We focus on hyperparameters that are related to the optimization algorithm, e.g. learning rates, which have a large impact on the training speed and the resulting accuracy. Typically, fixed learning rate schedules are employed during training. We propose Hyperdyn a dynamic hyperparameter optimization method that selects new learning rates on the fly at the end of each epoch. Our explore-exploit framework combines Bayesian optimization (BO) with a rejection strategy, based on a simple probabilistic wait and watch test. We obtain state of art accuracy results on CIFAR and Imagenet datasets, but with significantly faster training, when compared with the best manually tuned networks.
  • TL;DR: Bayesian optimization based online hyperparameter optimization.
  • Keywords: hyperparameters, optimization, SGD, Adam, Bayesian