Accelerated Quasi-Newton Proximal Extragradient: Faster Rate for Smooth Convex Optimization

Published: 21 Sept 2023, Last Modified: 15 Jan 2024NeurIPS 2023 spotlightEveryoneRevisionsBibTeX
Keywords: convex optimization, quasi-Newton methods, Monteiro-Svaiter acceleration, Nesterov's accelerated gradient, online learning
TL;DR: We propose a quasi-Newton-type method that converges faster than Nesterov's accelerated gradient in the "small dimension" regime, while matching its optimal rate in the "large dimension" regime.
Abstract: In this paper, we propose an accelerated quasi-Newton proximal extragradient method for solving unconstrained smooth convex optimization problems. With access only to the gradients of the objective, we prove that our method can achieve a convergence rate of $\mathcal{O}\bigl(\min\\{\frac{1}{k^2}, \frac{\sqrt{d\log k}}{k^{2.5}}\\}\bigr)$, where $d$ is the problem dimension and $k$ is the number of iterations. In particular, in the regime where $k = \mathcal{O}(d)$, our method matches the _optimal rate_ of $\mathcal{O}(\frac{1}{k^2})$ by Nesterov's accelerated gradient (NAG). Moreover, in the the regime where $k = \Omega(d \log d)$, it outperforms NAG and converges at a _faster rate_ of $\mathcal{O}\bigl(\frac{\sqrt{d\log k}}{k^{2.5}}\bigr)$. To the best of our knowledge, this result is the first to demonstrate a provable gain for a quasi-Newton-type method over NAG in the convex setting. To achieve such results, we build our method on a recent variant of the Monteiro-Svaiter acceleration framework and adopt an online learning perspective to update the Hessian approximation matrices, in which we relate the convergence rate of our method to the dynamic regret of a specific online convex optimization problem in the space of matrices.
Supplementary Material: zip
Submission Number: 5911