Projection-free Algorithms for Online Convex Optimization with Adversarial Constraints
TL;DR: We propose a fast projection-free algorithm for the problem of online convex optimization with adversarial constraints.
Abstract: We study a generalization of the Online Convex Optimization (OCO) framework with time-varying adversarial constraints. In this problem, after selecting a feasible action from the convex decision set $\mathcal{X},$ a convex constraint function is revealed alongside the cost function in each round. Our goal is to design a computationally efficient learning policy that achieves a small regret with respect to the cost functions and a small cumulative constraint violation (CCV) with respect to the constraint functions over a horizon of length $T$.
It is well-known that the projection step constitutes the major computational bottleneck of the standard OCO algorithms. However, for many structured decision sets, linear functions can be efficiently optimized over the decision set. We propose a *projection-free* online policy which makes a single call to a Linear Optimization Oracle (LOO) per round. Our method outperforms state-of-the-art projection-free online algorithms with adversarial constraints, achieving bounds of $\tilde{O}(T^{\frac{3}{4}})$ for both regret and CCV without resorting to techniques like the doubling trick.
The proposed algorithm is conceptually simple: it first constructs a surrogate cost function as a non-negative linear combination of the cost and constraint functions, and then passes the surrogate costs to a new, adaptive version of the online conditional gradient subroutine, which we propose in this paper. We also extend our methodology to the bandit setting, where we identify the need for a new form of surrogate loss function to handle bandit feedback, a point missed in the related literature, thereby establishing new state-of-the-art guarantees of $\tilde{O}(T^{\frac{3}{4}})$ for both expected regret and violation.
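To make the recipe concrete, below is a minimal, illustrative sketch, not the paper's exact algorithm or step sizes. It assumes a probability-simplex decision set (where the LOO is just a coordinate minimum), a hypothetical fixed quadratic cost and linear constraint, and a simplified penalty weight that grows with the accumulated violation; the paper's adaptive online conditional gradient subroutine and its tuning are not reproduced here.

```python
def loo_simplex(grad):
    """Linear Optimization Oracle over the probability simplex:
    argmin_{x in simplex} <grad, x> is the vertex at the smallest
    coordinate of grad, so one LOO call is just a coordinate argmin."""
    i = min(range(len(grad)), key=lambda j: grad[j])
    v = [0.0] * len(grad)
    v[i] = 1.0
    return v


def run(T=200, d=3):
    """One possible projection-free loop with a surrogate cost:
    play x_t, observe cost and constraint, form a surrogate gradient
    (cost gradient plus a violation-driven multiple of the constraint
    gradient), and take one conditional-gradient step via the LOO."""
    x = [1.0 / d] * d   # start at the simplex center
    ccv = 0.0           # cumulative constraint violation (CCV)
    for t in range(1, T + 1):
        # Hypothetical round-t data: cost f_t(x) = ||x - c||^2 and
        # constraint g_t(x) = x[0] - 0.5 <= 0 (for illustration only).
        c = [0.2, 0.3, 0.5]
        f_grad = [2.0 * (x[j] - c[j]) for j in range(d)]
        g_val = x[0] - 0.5
        g_grad = [1.0, 0.0, 0.0]

        ccv += max(0.0, g_val)
        lam = ccv  # simplified: penalty weight tracks accumulated violation

        # Surrogate gradient: non-negative combination of cost and
        # (active) constraint gradients.
        active = 1.0 if g_val > 0.0 else 0.0
        s_grad = [f_grad[j] + lam * active * g_grad[j] for j in range(d)]

        # Single LOO call per round, then a convex-combination update,
        # so the iterate never leaves the decision set (no projection).
        v = loo_simplex(s_grad)
        eta = min(1.0, 2.0 / (t + 1))  # standard conditional-gradient step
        x = [(1.0 - eta) * x[j] + eta * v[j] for j in range(d)]
    return x, ccv
```

Because each update is a convex combination of simplex points, feasibility is maintained for free, which is exactly what makes the per-round cost a single linear optimization rather than a projection.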
Submission Number: 147