Efficient Continual Finite-Sum Minimization

Ioannis Mavrothalassitis; Stratis Skoulakis; Leello Tadesse Dadi; Volkan Cevher

Efficient Continual Finite-Sum Minimization

Ioannis Mavrothalassitis, Stratis Skoulakis, Leello Tadesse Dadi, Volkan Cevher

Published: 16 Jan 2024, Last Modified: 21 Apr 2024ICLR 2024 posterEveryoneRevisionsBibTeX

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Finite Sum Minimization, Variance Reduction, Optimization

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Abstract: Given a sequence of functions $f_1,\ldots,f_n$ with $f_i:\mathcal{D}\mapsto \mathbb{R}$, finite-sum minimization seeks a point ${x}^\star \in \mathcal{D}$ minimizing $\sum_{j=1}^nf_j(x)/n$. In this work, we propose a key twist into the finite-sum minimization, dubbed as *continual finite-sum minimization*, that asks for a sequence of points $x_1^\star, \ldots, x_n^\star \in D$ such that each ${x}^\star_i \in D$ minimizes the prefix-sum $\sum_{j=1}^if_j(x)/i$. Assuming that each prefix-sum is strongly convex, we develop a first-order continual stochastic variance reduction gradient method ($\mathrm{CSVRG}$) producing an $\epsilon$-optimal sequence with $\tilde{\mathcal{O}}(n/\epsilon^{1/3} + 1/\sqrt{\epsilon})$ overall *first-order oracles* (FO). An FO corresponds to the computation of a single gradient $\nabla f_j(x)$ at a given $x \in \mathcal{D}$ for some $j \in [n]$. Our approach significantly improves upon the $\mathcal{O}(n/\epsilon)$ FOs that $\mathrm{StochasticGradientDescent}$ requires and the $\mathcal{O}(n^2 \log (1/\epsilon))$ FOs that state-of-the-art variance reduction methods such as $\mathrm{Katyusha}$ require. We also prove that there is no natural first-order method with $\mathcal{O}\left(n/\epsilon^\alpha\right)$ gradient complexity for $\alpha < 1/4$, establishing that the first-order complexity of our method is nearly tight.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

Supplementary Material: zip

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Primary Area: optimization

Submission Number: 7163

Loading