CONTINUAL FINITE-SUM MINIMIZATION UNDER THE POLYAK-ŁOJASIEWICZ CONDITION

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Continual Learning, Finite Sum Minimization
Abstract: Given functions $f_1,\ldots,f_n$ where $f_i:\mathcal{D}\to \mathbb{R}$, \textit{continual finite-sum minimization} (CFSM) asks for an $\epsilon$-optimal sequence $\hat{x}_1,\ldots,\hat{x}_n \in \mathcal{D}$ such that $$\sum_{j=1}^i f_j(\hat{x}_i)/i - \min_{x \in \mathcal{D}}\sum_{j=1}^i f_j(x)/i \leq \epsilon \quad \text{for all } i \in [n].$$ In this work, we develop a new CFSM framework under the Polyak-Łojasiewicz (PL) condition, where each prefix-sum function $\sum_{j=1}^i f_j(x)/i$ satisfies the PL condition, extending the recent result on CFSM with strongly convex functions. We present a new first-order method that, under the PL condition, produces an $\epsilon$-optimal sequence using $\mathcal{O}(n/\sqrt{\epsilon})$ first-order oracle (FO) calls overall, where an FO corresponds to the computation of a single gradient $\nabla f_j(x)$ at a given $x \in \mathcal{D}$ for some $j \in [n]$. Our method improves upon both the $\mathcal{O}(n^2 \log (1/\epsilon))$ FO complexity of state-of-the-art variance reduction methods and the $\mathcal{O}(n/\epsilon)$ FO complexity of $\mathrm{StochasticGradientDescent}$. We experimentally evaluate our method in continual learning and unlearning settings, demonstrating the potential of the CFSM framework in non-convex, deep learning problems.
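The sketch below is purely illustrative of the CFSM problem statement and the FO counting convention described in the abstract; it is not the paper's method. The quadratic choice of $f_j$, the warm-started full-gradient baseline, and all names (e.g., `grad_f`, `prefix_gap`) are assumptions made for the example.

```python
import numpy as np

# Illustrative CFSM setup (assumed toy instance, not the paper's algorithm):
# f_j(x) = 0.5 * ||x - a_j||^2, so the i-th prefix-average minimizer is mean(a_1..a_i).

rng = np.random.default_rng(0)
n, d = 20, 5
A = rng.normal(size=(n, d))          # data points a_1, ..., a_n

def grad_f(j, x):
    """One first-order oracle (FO) call: the gradient of f_j at x."""
    return x - A[j]

def prefix_gap(i, x):
    """Suboptimality F_i(x) - min_y F_i(y), where F_i is the average of f_1..f_i."""
    x_star = A[:i].mean(axis=0)
    F = lambda y: 0.5 * np.mean(np.sum((y - A[:i]) ** 2, axis=1))
    return F(x) - F(x_star)

eps, lr = 1e-3, 0.5
x, fo_calls, xs = np.zeros(d), 0, []
for i in range(1, n + 1):
    # Naive warm-started baseline: full-gradient descent on the i-th prefix average.
    # Each full gradient costs i FO calls; the paper targets O(n / sqrt(eps)) FOs total.
    while prefix_gap(i, x) > eps:
        g = np.mean([grad_f(j, x) for j in range(i)], axis=0)
        fo_calls += i
        x = x - lr * g
    xs.append(x.copy())   # hat{x}_i: eps-optimal for the i-th prefix

print(f"FO calls used by the naive baseline: {fo_calls}")
```

Under this toy construction, each $\hat{x}_i$ stored in `xs` satisfies the displayed $\epsilon$-optimality condition, and `fo_calls` illustrates how FO complexity is tallied when comparing methods.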
Supplementary Material: zip
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10980