Variance Reduced Model Based Methods: New rates and adaptive step sizes

Published: 26 Oct 2023, Last Modified: 13 Dec 2023, NeurIPS 2023 Workshop Poster
Keywords: variance reduced, finite sum minimization, polyak step size, model based method, adaptive learning rates
TL;DR: We give a new adaptive learning rate for SAG that converges in the non-smooth, smooth, and strongly convex settings without access to the smoothness constant.
Abstract: Variance-reduced gradient methods were introduced to control the variance of SGD (Stochastic Gradient Descent). Model-based methods are able to make use of a known lower bound on the loss; for instance, most loss functions are non-negative. We show how these two classes of methods can be seamlessly combined. As an example we present a Model-based Stochastic Average Gradient method, MSAG, which results from using a truncated model together with the SAG method. At each iteration MSAG computes an adaptive learning rate based on a given known lower bound. When given access to the optimal objective as the lower bound, MSAG has several favorable convergence properties, including monotonic iterates and convergence in the non-smooth, smooth, and strongly convex settings. Our convergence theorems show that we can essentially trade off knowing the smoothness constant $L_{\max}$ for knowing the optimal objective, while achieving the favorable convergence of variance-reduced gradient methods. Moreover, our convergence proofs for MSAG are very simple, in contrast to the complexity of the original convergence proofs of SAG.
Submission Number: 65