Heavy Ball Momentum for Conditional Gradient

21 May 2021, 20:43 (modified: 21 Jan 2022, 20:25) NeurIPS 2021 Poster
Keywords: Frank Wolfe, conditional gradient, heavy ball momentum
TL;DR: Heavy ball momentum tightens the primal dual error of Frank-Wolfe (aka conditional gradient) method.
Abstract: Conditional gradient, aka Frank-Wolfe (FW), algorithms have well-documented merits in machine learning and signal processing applications. Unlike in projection-based methods, momentum cannot improve the convergence rate of FW in general. This limitation motivates the present work, which deals with heavy ball momentum and its impact on FW. Specifically, it is established that heavy ball offers a unifying perspective on primal-dual (PD) convergence and enjoys a tighter \textit{per iteration} PD error rate for multiple choices of step sizes, where the PD error can serve as a stopping criterion in practice. In addition, it is asserted that restart, a scheme typically employed jointly with Nesterov's momentum, can further tighten this PD error bound. Numerical results demonstrate the usefulness of heavy ball momentum in FW iterations.
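To make the abstract's idea concrete, the sketch below shows FW where the linear minimization oracle is queried with an exponentially averaged (heavy-ball) gradient rather than the instantaneous one, on a least-squares problem over the probability simplex. This is a minimal illustration, not the paper's exact algorithm: the momentum weights `delta` and step sizes `gamma` use the standard 2/(k+2) schedule as an assumed choice, and the problem instance is synthetic.

```python
import numpy as np

def lmo_simplex(g):
    # Linear minimization oracle over the probability simplex:
    # argmin_{v in simplex} <g, v> is the vertex at g's smallest coordinate.
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

def heavy_ball_fw(grad, x0, num_iters=200):
    """Frank-Wolfe with heavy-ball (exponentially averaged) gradients.

    A sketch under assumed schedules: the LMO sees a running average of
    past gradients; both the momentum weight and the step size follow
    the classical 2/(k+2) rule (illustrative, not the paper's tuning).
    """
    x = x0.copy()
    g = np.zeros_like(x0)
    for k in range(num_iters):
        delta = 2.0 / (k + 2)                  # momentum weight (assumed)
        g = (1 - delta) * g + delta * grad(x)  # heavy-ball gradient average
        v = lmo_simplex(g)                     # FW direction from averaged gradient
        gamma = 2.0 / (k + 2)                  # FW step size
        x = x + gamma * (v - x)                # convex update stays feasible
    return x

# Usage: minimize ||A x - b||^2 over the simplex (synthetic instance).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
grad = lambda x: 2 * A.T @ (A @ x - b)
x0 = np.full(10, 0.1)                          # uniform feasible start
x_hat = heavy_ball_fw(grad, x0)
```

Because each update is a convex combination of simplex points, the iterates remain feasible without any projection, which is the practical appeal of FW-type methods that the paper builds on.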
Supplementary Material: pdf
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
Code: https://github.com/BingcongLi/HFW