Keywords: Regret bounds, Constrained Online Convex Optimization, Strong convexity, Firmly Nonexpansive Operators, Learning Theory
TL;DR: We propose CLASP, a novel algorithm for constrained online convex optimization with squared penalties, giving the first logarithmic bounds on regret and constraint violations in the strongly convex setting.
Abstract: We study Constrained Online Convex Optimization (COCO), where a learner chooses actions iteratively, observes an a priori unknown convex loss function and convex constraint function at each round, and accumulates loss while incurring penalties for constraint violations. We introduce CLASP (Convex Losses And Squared Penalties), an algorithm that minimizes cumulative loss together with squared constraint violations. Our analysis departs from prior work by fully leveraging the firm nonexpansiveness of convex projectors, a proof strategy not previously applied in this setting. For convex losses, CLASP achieves regret $O\left(T^{\max\{\beta,1-\beta\}}\right)$ and cumulative squared penalty $O\left(T^{1-\beta}\right)$ for any $\beta \in (0,1)$. Most importantly, for strongly convex losses, CLASP provides the first logarithmic guarantees: both the regret and the cumulative squared penalty are upper bounded by $O(\log T)$.
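The abstract's key proof ingredient is the firm nonexpansiveness of convex projectors: for the Euclidean projection $P$ onto a convex set, $\|P(x)-P(y)\|^2 \le \langle P(x)-P(y),\, x-y\rangle$ for all $x, y$. The sketch below (an illustration of this standard property, not the paper's CLASP algorithm) checks the inequality numerically for the unit ball, where the projector has a closed form:

```python
import numpy as np

def project_unit_ball(x):
    """Euclidean projection onto the closed unit ball (a convex set)."""
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

# Firm nonexpansiveness: ||P(x) - P(y)||^2 <= <P(x) - P(y), x - y>.
# Verified here on random pairs; this is a general property of
# projections onto convex sets, used by the paper's analysis.
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=5) * 3.0
    y = rng.normal(size=5) * 3.0
    px, py = project_unit_ball(x), project_unit_ball(y)
    lhs = np.dot(px - py, px - py)
    rhs = np.dot(px - py, x - y)
    assert lhs <= rhs + 1e-12
print("firm nonexpansiveness holds on 1000 random pairs")
```

Note that firm nonexpansiveness implies ordinary nonexpansiveness ($\|P(x)-P(y)\| \le \|x-y\|$, by Cauchy-Schwarz) but is strictly stronger, which is what makes it useful for tighter regret analyses.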
Primary Area: optimization
Submission Number: 21304