Closing the gap between the upper bound and lower bound of Adam's iteration complexity

Published: 21 Sept 2023 · Last Modified: 02 Nov 2023 · NeurIPS 2023 poster
Keywords: Adam, Convergence, Upper Bound, Lower Bound
TL;DR: We prove an upper bound on Adam's iteration complexity that matches the lower bound for the first time.
Abstract: Recently, Arjevani et al. [1] established a lower bound on the iteration complexity of first-order optimization under an $L$-smoothness condition and a bounded-noise-variance assumption. However, a thorough review of the existing literature on Adam's convergence reveals a noticeable gap: none of the known upper bounds meets this lower bound. In this paper, we close the gap by deriving a new convergence guarantee for Adam that requires only the $L$-smoothness condition and the bounded-noise-variance assumption. Our results remain valid across a broad spectrum of hyperparameters. In particular, with properly chosen hyperparameters, we derive an upper bound on Adam's iteration complexity and show that it meets the lower bound for first-order optimizers. To the best of our knowledge, this is the first work to establish such a tight upper bound for Adam's convergence. Our proof uses novel techniques to handle the entanglement between momentum and the adaptive learning rate, and to convert the first-order term in the Descent Lemma into the gradient norm, which may be of independent interest.
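For reference, below is a minimal sketch of the standard Adam recursion (following Kingma & Ba, with bias correction omitted); the momentum term $m_t$ and the adaptive learning rate $\eta/(\sqrt{v_t}+\lambda)$ are the quantities whose entanglement the abstract refers to. The symbols $\beta_1$, $\beta_2$, $\eta$, and $\lambda$ are the usual hyperparameters and are not taken from this paper's notation:

$$
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2}, \qquad
x_{t+1} = x_t - \frac{\eta\, m_t}{\sqrt{v_t} + \lambda},
$$

where $g_t$ is a stochastic gradient of the $L$-smooth objective at $x_t$, and the squaring and division are taken coordinate-wise.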
Supplementary Material: pdf
Submission Number: 1718