Abstract: Adaptive gradient methods for online convex optimization must choose between diagonal preconditioning, which exploits gradient sparsity, and full-matrix preconditioning, which exploits low-rank structure; neither exploits both structures simultaneously. We propose SLR-FTRL, an algorithm that decomposes the gradient preconditioner into sparse and low-rank components via an online robust PCA oracle, runs two parallel FTRL sub-algorithms with structurally matched preconditioners, and combines their outputs through a coin-betting meta-algorithm that requires no knowledge of the structural parameters. We prove a regret bound of the form $\min_{\alpha \in [0,1]}\{\alpha\, R_T^L + (1-\alpha)\, R_T^S\} + \widetilde{O}(\sqrt{T})$ that recovers the classical diagonal and full-matrix AdaGrad guarantees as special cases when one structure is absent, with all cross-contamination and preconditioner-lag corrections explicitly retained. We further establish per-term lower bounds showing that the structural dependence on $\sqrt{r \cdot \sigma_1(\mathbf{G}_T^L)}$ and $\sqrt{s \cdot \|\mathbf{G}_T^S\|_\infty}$ is individually tight up to constants. Experiments on online regression with structured gradients confirm the theoretical predictions, demonstrating sublinear regret, pure-case recovery, dimension independence, and graceful degradation under decomposition noise.
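To make the three-component architecture concrete, below is a minimal Python sketch of the structure the abstract describes: split each gradient into sparse and low-rank streams, run one full-matrix-style and one diagonal-style FTRL sub-algorithm, and mix their iterates with a parameter-free combiner. It is illustrative only and not the authors' implementation: `split_gradient` is a hypothetical top-k stand-in for the online robust PCA oracle, the sub-updates are plain AdaGrad-style FTRL on linearized losses, and `KTCombiner` is a simple Krichevsky-Trofimov coin bettor standing in for the paper's coin-betting meta-algorithm; all names, constants, and the clipping of the mixing weight to $[0,1]$ are assumptions.

```python
import numpy as np


def split_gradient(g, s):
    """Hypothetical stand-in for the online robust PCA oracle: route the s
    largest-magnitude coordinates to the sparse stream, the rest to the
    (approximately) low-rank stream."""
    idx = np.argsort(-np.abs(g))[:s]
    g_sparse = np.zeros_like(g)
    g_sparse[idx] = g[idx]
    return g - g_sparse, g_sparse


class FullMatrixFTRL:
    """FTRL sub-algorithm with a full-matrix AdaGrad-style preconditioner."""

    def __init__(self, d, eta=1.0, eps=1e-8):
        self.G = eps * np.eye(d)   # running sum of gradient outer products
        self.z = np.zeros(d)       # running sum of gradients
        self.eta = eta

    def update(self, g):
        self.G += np.outer(g, g)
        self.z += g
        vals, vecs = np.linalg.eigh(self.G)
        inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, 1e-12))) @ vecs.T
        return -self.eta * inv_sqrt @ self.z   # x_{t+1} = -eta * G_t^{-1/2} z_t


class DiagonalFTRL:
    """FTRL sub-algorithm with a diagonal AdaGrad-style preconditioner."""

    def __init__(self, d, eta=1.0, eps=1e-8):
        self.h = np.full(d, eps)   # running sum of squared gradients
        self.z = np.zeros(d)
        self.eta = eta

    def update(self, g):
        self.h += g * g
        self.z += g
        return -self.eta * self.z / np.sqrt(self.h)


class KTCombiner:
    """Krichevsky-Trofimov coin bettor learning a mixing weight in [0, 1];
    a simple parameter-free stand-in for the coin-betting meta-algorithm."""

    def __init__(self):
        self.wealth, self.coin_sum, self.rounds = 1.0, 0.0, 0

    def weight(self):
        beta = self.coin_sum / (self.rounds + 1)   # KT betting fraction
        return float(np.clip(0.5 + beta * self.wealth, 0.0, 1.0))

    def update(self, coin):
        coin = float(np.clip(coin, -1.0, 1.0))
        beta = self.coin_sum / (self.rounds + 1)
        self.wealth *= 1.0 + coin * beta           # multiplicative wealth update
        self.coin_sum += coin
        self.rounds += 1


# Toy run on online linear losses <g_t, x_t> with sparse-plus-low-rank gradients.
d, T, rank, sparsity = 20, 200, 3, 4
rng = np.random.default_rng(0)
U = rng.standard_normal((d, rank))
low_alg, sparse_alg, meta = FullMatrixFTRL(d), DiagonalFTRL(d), KTCombiner()
x_low, x_sparse, total_loss = np.zeros(d), np.zeros(d), 0.0

for t in range(T):
    alpha = meta.weight()
    x = (1.0 - alpha) * x_low + alpha * x_sparse          # combined iterate played
    g = U @ rng.standard_normal(rank)                     # low-rank component
    g[rng.choice(d, sparsity, replace=False)] += 3.0 * rng.standard_normal(sparsity)
    total_loss += g @ x
    # Coin: normalized advantage of the sparse iterate over the low-rank one.
    gap = x_low - x_sparse
    meta.update((g @ gap) / (np.linalg.norm(g) * np.linalg.norm(gap) + 1e-12))
    g_low, g_sparse = split_gradient(g, sparsity)         # decomposition oracle
    x_low, x_sparse = low_alg.update(g_low), sparse_alg.update(g_sparse)

print(f"cumulative linear loss after {T} rounds: {total_loss:.3f}")
```

Under this sketch, the combined iterate drifts toward whichever sub-algorithm better fits the realized gradient structure, mirroring the $\min\{\cdot\}$ term in the stated regret bound; the actual algorithm, oracle, and meta-algorithm are specified in the paper.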
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Zhiyu_Zhang1
Submission Number: 8256