Minibatch and Momentum Model-based Methods for Stochastic Weakly Convex Optimization

Qi Deng; Wenzhi Gao

Minibatch and Momentum Model-based Methods for Stochastic Weakly Convex Optimization

Qi Deng, Wenzhi Gao

Published: 09 Nov 2021, Last Modified: 05 May 2023NeurIPS 2021 PosterReaders: Everyone

Keywords: model-based optimization, weakly convex optimization, stochastic optimization, algorithm stability, momentum methods

TL;DR: We justify that minibatching can accelerate convergence of many stochastic algorithms theoretically for a wide range of problems; and also present a general scheme to add momentum to stochastic algorithms

Abstract: Stochastic model-based methods have received increasing attention lately due to their appealing robustness to the stepsize selection and provable efficiency guarantee. We make two important extensions for improving model-based methods on stochastic weakly convex optimization. First, we propose new minibatch model- based methods by involving a set of samples to approximate the model function in each iteration. For the first time, we show that stochastic algorithms achieve linear speedup over the batch size even for non-smooth and non-convex (particularly, weakly convex) problems. To this end, we develop a novel sensitivity analysis of the proximal mapping involved in each algorithm iteration. Our analysis appears to be of independent interests in more general settings. Second, motivated by the success of momentum stochastic gradient descent, we propose a new stochastic extrapolated model-based method, greatly extending the classic Polyak momentum technique to a wider class of stochastic algorithms for weakly convex optimization. The rate of convergence to some natural stationarity condition is established over a fairly flexible range of extrapolation terms. While mainly focusing on weakly convex optimization, we also extend our work to convex optimization. We apply the minibatch and extrapolated model-based methods to stochastic convex optimization, for which we provide a new complexity bound and promising linear speedup in batch size. Moreover, an accelerated model-based method based on Nesterov’s momentum is presented, for which we establish an optimal complexity bound for reaching optimality.

Supplementary Material: zip

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

12 Replies

Loading