Incorporating Nesterov Momentum into Adam
Feb 18, 2016 (modified: Feb 18, 2016) · ICLR 2016 workshop submission · readers: everyone
Abstract: This work aims to improve upon the recently proposed and rapidly popularized optimization algorithm Adam (Kingma & Ba, 2014). Adam has two main components: a momentum component and an adaptive learning rate component. However, regular momentum can be shown conceptually and empirically to be inferior to a similar algorithm known as Nesterov's accelerated gradient (NAG). We show how to modify Adam's momentum component to take advantage of insights from NAG, and then we present preliminary evidence suggesting that making this substitution improves the speed of convergence and the quality of the learned models.
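The modification the abstract describes can be sketched as follows. This is a simplified illustration of an Adam-style update with a Nesterov-flavored momentum term, not the paper's exact algorithm: the hyperparameter names (`lr`, `mu`, `nu`, `eps`) are illustrative, and the momentum-decay schedule from the full paper is omitted.

```python
import math

def nadam_step(theta, grad, m, n, t, lr=0.002, mu=0.9, nu=0.999, eps=1e-8):
    """One Nesterov-momentum Adam update on a scalar parameter theta.

    Simplified sketch: constant momentum coefficient mu, no decay schedule.
    """
    m = mu * m + (1 - mu) * grad          # first-moment (momentum) estimate
    n = nu * n + (1 - nu) * grad ** 2     # second-moment (adaptive-lr) estimate
    # Nesterov-style correction: apply the current step's momentum factor to
    # the look-ahead, rather than reusing the previous momentum vector as
    # plain Adam does.
    m_hat = (mu * m) / (1 - mu ** (t + 1)) + ((1 - mu) * grad) / (1 - mu ** t)
    n_hat = n / (1 - nu ** t)             # standard bias correction
    theta = theta - lr * m_hat / (math.sqrt(n_hat) + eps)
    return theta, m, n

# Illustrative use: minimize f(x) = x^2 (gradient 2x) starting from x = 5.
x, m, n = 5.0, 0.0, 0.0
for t in range(1, 501):
    x, m, n = nadam_step(x, 2 * x, m, n, t)
```

With adaptive rescaling, each step has magnitude close to `lr`, so after 500 steps `x` has moved steadily toward the minimum at 0.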