Does Momentum Help in Stochastic Optimization? A Sample Complexity Analysis.Download PDF

Published: 08 May 2023, Last Modified: 26 Jun 2023UAI 2023Readers: Everyone
Keywords: Heavy-ball, Nesterov, momentum, sample complexity
Abstract: Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in optimization. While the benefits of these acceleration ideas in deterministic settings are well understood, their advantages in stochastic optimization are unclear. Several works have recently claimed that SHB and ASG always help in stochastic optimization. Our work shows that i.) these claims are either flawed or one-sided (e.g., consider only the bias term but not the variance), and ii.) when both these terms are accounted for, SHB and ASG do not always help. Specifically, for \textit{any} quadratic optimization, we obtain a lower bound on the sample complexity of SHB and ASG, accounting for both bias and variance, and show that the vanilla SGD can achieve the same bound.
Supplementary Material: pdf
Other Supplementary Material: zip
0 Replies