Efficient Learning Rate Schedules for Stochastic Non-negative Matrix Factorization via Reinforcement Learning
Keywords: optimization, learning to learn, neural network training
TL;DR: We derive a theoretical upper bound on the SGD learning rate schedule that guarantees convergence for stochastic NMF, and use a reinforcement learning agent to find efficient schedules for this problem.
Abstract: In deep learning, learning rate schedules are often chosen by trial and error or by hand-crafted optimization algorithms that focus mostly on maintaining stability and convergence, without systematically incorporating higher-order derivative information to optimize the rate of convergence. In this paper, we consider a stochastic version of Non-negative Matrix Factorization (NMF) in which only a noisy gradient is available, and we calculate a theoretical upper bound on the SGD learning rate (LR) schedule that guarantees convergence, thereby providing a clean example where stability and convergence are not a challenge. We then use a Reinforcement Learning agent to demonstrate that efficient LR schedules, superior to those found by traditional algorithms, can be found for this NMF problem.
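For intuition, the setting described in the abstract, NMF optimized with projected SGD under a noisy gradient and a scheduled learning rate, can be sketched as follows. This is a minimal illustration assuming a squared Frobenius loss, additive Gaussian gradient noise, and a generic 1/t decay schedule; none of these specifics (nor the names `lr_schedule`, `lr0`, `decay`) come from the paper, which instead derives a theoretical upper bound for the schedule and trains an RL agent to improve on it.

```python
import numpy as np

# Hypothetical sketch: stochastic NMF via projected SGD with a decaying
# learning-rate schedule. The 1/(1 + decay*t) schedule and noise model are
# illustrative assumptions, not the paper's derived bound or method.

rng = np.random.default_rng(0)
m, n, r = 50, 40, 5
V = np.abs(rng.standard_normal((m, n)))          # non-negative target matrix

W = np.abs(rng.standard_normal((m, r)))          # non-negative factor initializations
H = np.abs(rng.standard_normal((r, n)))

def lr_schedule(t, lr0=1e-3, decay=1e-2):
    """Illustrative decaying schedule; the paper derives an upper bound instead."""
    return lr0 / (1.0 + decay * t)

for t in range(10_000):
    R = W @ H - V                                 # residual of current factorization
    noise_W = 0.01 * rng.standard_normal(W.shape) # only a noisy estimate of the
    noise_H = 0.01 * rng.standard_normal(H.shape) # true gradient is observed
    grad_W = R @ H.T + noise_W                    # gradient of 0.5*||V - WH||_F^2 in W
    grad_H = W.T @ R + noise_H                    # gradient of 0.5*||V - WH||_F^2 in H
    lr = lr_schedule(t)
    W = np.maximum(W - lr * grad_W, 0.0)          # projected step keeps W non-negative
    H = np.maximum(H - lr * grad_H, 0.0)          # projected step keeps H non-negative

print("final loss:", 0.5 * np.linalg.norm(V - W @ H) ** 2)
```

In this framing, an RL agent would replace `lr_schedule`, observing training statistics and emitting the next step size, with the theoretical bound keeping any proposed schedule inside the region where convergence is guaranteed.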