Optimistic Acceleration for Optimization

Jun-Kun Wang; Xiaoyun Li; Ping Li

27 Sept 2018 (modified: 05 May 2023), ICLR 2019 Conference Blind Submission, Readers: Everyone

Abstract: We consider new variants of optimization algorithms. Our algorithms are based on the observation that mini-batch stochastic gradients in consecutive iterations do not change drastically and may therefore be predictable. Inspired by a similar setting in the online learning literature, known as Optimistic Online Learning, we propose two new optimistic algorithms, one for AMSGrad and one for Adam, that exploit this predictability of gradients. The new algorithms combine ideas from momentum methods, adaptive gradient methods, and Optimistic Online Learning, which leads to a speedup in training deep neural nets in practice.
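To make the idea in the abstract concrete, here is a minimal sketch of what an optimistic variant of an AMSGrad-style update could look like. It is not the authors' algorithm, which the abstract does not spell out. The assumptions are: the prediction ("hint") for the next gradient is simply the most recent gradient, the update keeps an auxiliary iterate `w_aux` and takes an extra step along the hint, and bias correction is omitted. The function name, the hyperparameter defaults, and the quadratic test problem are all hypothetical.

```python
import math


def optimistic_amsgrad_sketch(grad_fn, w0, lr=0.01, beta1=0.9, beta2=0.999,
                              eps=1e-8, steps=500):
    """Illustrative optimistic AMSGrad-style loop (an assumption, not the paper's method)."""
    n = len(w0)
    w_aux = list(w0)      # auxiliary iterate, updated with the momentum term
    w = list(w0)          # iterate at which gradients are actually evaluated
    theta = [0.0] * n     # first-moment (momentum) estimate
    v = [0.0] * n         # second-moment estimate
    v_hat = [0.0] * n     # running maximum of v, as in AMSGrad
    for _ in range(steps):
        g = grad_fn(w)
        for i in range(n):
            theta[i] = beta1 * theta[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
            v_hat[i] = max(v_hat[i], v[i])
            scale = lr / (math.sqrt(v_hat[i]) + eps)
            # AMSGrad-style step on the auxiliary iterate.
            w_aux[i] -= scale * theta[i]
            # Optimistic step: assume the next gradient resembles the current one.
            hint = g[i]
            w[i] = w_aux[i] - scale * hint
    return w


def quadratic_loss(w, target):
    """Toy objective: 0.5 * ||w - target||^2."""
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def quadratic_grad(w, target):
    """Gradient of the toy objective."""
    return [wi - ti for wi, ti in zip(w, target)]
```

Calling `optimistic_amsgrad_sketch(lambda x: quadratic_grad(x, [1.0, -2.0]), [0.0, 0.0])` runs the loop on the toy quadratic. Using the last gradient as the hint is the simplest predictor consistent with the observation that consecutive mini-batch gradients change slowly; other predictors could be substituted at the `hint` line.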

Keywords: optimization, Adam, AMSGrad

TL;DR: We consider new variants of optimization algorithms for training deep nets.

Data: [IMDb Movie Reviews](https://paperswithcode.com/dataset/imdb-movie-reviews)

