Understanding Accelerated Gradient Methods: Lyapunov Analyses and Hamiltonian-Assisted Interpretations
Abstract: We formulate two classes of first-order algorithms more general than previously studied for minimizing smooth and strongly convex or, respectively, smooth and convex functions. We establish sufficient conditions, via new discrete Lyapunov analyses, for achieving accelerated convergence rates that match Nesterov's methods in the strongly convex and general convex settings, respectively. Our results identify, for the first time, a simple and unified condition on gradient correction for accelerated convergence. Next, we study the convergence of limiting ordinary differential equations (ODEs), including high-resolution ODEs, and point out currently notable gaps between the convergence properties of the corresponding algorithms and ODEs, especially regarding the role of gradient correction. Finally, we propose a novel class of discrete algorithms, called the Hamiltonian-assisted gradient method, built directly on a Hamiltonian function and several interpretable operations, and then demonstrate meaningful and unified interpretations of our acceleration conditions in terms of the momentum variable updates.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Following the reviewers' comments and suggestions, we have made various improvements to the paper, including the following aspects:
- literature review and comparison;
- readability and notation issues;
- well-posedness of ODEs.
See the official comment for more details.
Assigned Action Editor: ~Eduard_Gorbunov1
Submission Number: 5664