Keywords: learned optimizer, neural network, training, reinforcement learning
TL;DR: We propose a novel RL-based learning rate scheduler that learns to predict optimal learning rates based on training progress.
Abstract: We present a novel strategy for generating learned learning rate schedules for any optimizer using reinforcement learning (RL). Our approach trains a Proximal Policy Optimization (PPO) agent to predict optimal learning rate schedules for SGD, which we compare against other optimizer-scheduler combinations and a full grid search. Our experiments show that the agent learns to generate dynamic schedules that yield stable, non-divergent loss histories, and that it can be more useful in practice than equally expensive hyperparameter optimization and fixed optimizer-scheduler combinations.
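As a rough illustration of the setup the abstract describes, the sketch below shows the interaction loop between a scheduler policy and an SGD training run: at each step the policy observes the training progress and returns a learning rate, which is then used for one SGD update. Everything specific here is an assumption for illustration and is not taken from the paper: the toy 1-D quadratic objective, the observation (progress fraction and log-loss), the reward (decrease in loss), and the names `ToySGDEnv`, `decaying_policy`, and `run_episode`. A hand-written decaying policy stands in for the trained PPO agent, so no RL training happens in this snippet.

```python
import math
import random


class ToySGDEnv:
    """Hypothetical environment: one episode is one SGD run on f(w) = 0.5 * w^2."""

    def __init__(self, steps=50, seed=0):
        self.steps = steps
        self.rng = random.Random(seed)

    def reset(self):
        self.w = self.rng.uniform(-5.0, 5.0)  # parameter being trained
        self.t = 0
        self.loss = 0.5 * self.w ** 2
        return self._obs()

    def _obs(self):
        # Assumed observation: fraction of training completed and log-loss.
        return (self.t / self.steps, math.log(self.loss + 1e-12))

    def step(self, lr):
        grad = self.w                   # gradient of 0.5 * w^2
        self.w -= lr * grad             # plain SGD update with the chosen rate
        new_loss = 0.5 * self.w ** 2
        reward = self.loss - new_loss   # assumed reward: decrease in loss
        self.loss = new_loss
        self.t += 1
        done = self.t >= self.steps
        return self._obs(), reward, done


def decaying_policy(obs, lr_max=0.5, lr_min=0.01):
    """Stand-in for a trained agent: maps training progress to a learning rate."""
    progress, _log_loss = obs
    # Geometric interpolation from lr_max (progress 0) towards lr_min (progress 1).
    return lr_max * (lr_min / lr_max) ** progress


def run_episode(env, policy):
    """Roll out one training run and record the learning rate schedule it produced."""
    obs = env.reset()
    schedule, total_reward, done = [], 0.0, False
    while not done:
        lr = policy(obs)
        obs, reward, done = env.step(lr)
        schedule.append(lr)
        total_reward += reward
    return schedule, total_reward, env.loss


if __name__ == "__main__":
    schedule, total_reward, final_loss = run_episode(ToySGDEnv(), decaying_policy)
    print(f"first lr={schedule[0]:.3f}, last lr={schedule[-1]:.3f}, final loss={final_loss:.3e}")
```

In an RL setting, `decaying_policy` would be replaced by a learned policy, and the rewards collected in `run_episode` would drive its updates; the loop structure itself would stay the same under these assumptions.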