Keywords: Regret bounds, Constrained Online Convex Optimization, Strong convexity, Firmly Nonexpansive Operators, Learning Theory
TL;DR: We propose CLASP, a novel algorithm for constrained online convex optimization with squared penalties, giving the first logarithmic bounds on regret and constraint violations in the strongly convex setting.
Abstract: We study Constrained Online Convex Optimization (COCO), where a learner chooses actions iteratively, observes an a priori unknown convex loss function and convex constraint function at each round, and accumulates loss while incurring penalties for constraint violations. We introduce CLASP (Convex Losses And Squared Penalties), an algorithm that minimizes cumulative loss together with squared constraint violations. Our analysis departs from prior work by fully leveraging the firm nonexpansiveness of convex projectors, a proof strategy not previously applied in this setting. For convex losses, CLASP achieves regret $O\left(T^{\max\{\beta,1-\beta\}}\right)$ and cumulative squared penalty $O\left(T^{1-\beta}\right)$ for any $\beta \in (0,1)$. Most importantly, for strongly convex losses, CLASP provides the first logarithmic guarantees: both the regret and the cumulative squared penalty are upper bounded by $O(\log T)$.
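The abstract's key proof ingredient is the firm nonexpansiveness of convex projectors: for the Euclidean projection $P$ onto a convex set, $\|P(x)-P(y)\|^2 \le \langle P(x)-P(y),\, x-y\rangle$ for all $x, y$. The sketch below (an illustration of this standard property, not the paper's CLASP algorithm) checks the inequality numerically for the unit ball, where the projector has a closed form:

```python
import numpy as np

def project_unit_ball(x):
    """Euclidean projection onto the closed unit ball (a convex set)."""
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

# Firm nonexpansiveness: ||P(x) - P(y)||^2 <= <P(x) - P(y), x - y>.
# Verified here on random pairs; this is a general property of
# projections onto convex sets, used by the paper's analysis.
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=5) * 3.0
    y = rng.normal(size=5) * 3.0
    px, py = project_unit_ball(x), project_unit_ball(y)
    lhs = np.dot(px - py, px - py)
    rhs = np.dot(px - py, x - y)
    assert lhs <= rhs + 1e-12
print("firm nonexpansiveness holds on 1000 random pairs")
```

Note that firm nonexpansiveness implies ordinary nonexpansiveness ($\|P(x)-P(y)\| \le \|x-y\|$, by Cauchy-Schwarz) but is strictly stronger, which is what makes it useful for tighter regret analyses.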
Primary Area: optimization
Submission Number: 21304