Unconstrained Robust Online Convex Optimization

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
TL;DR: This paper addresses unconstrained online learning with adversarially "corrupted" feedback.
Abstract: This paper addresses online learning with "corrupted" feedback. Our learner is provided with potentially corrupted gradients $\tilde g_t$ instead of the "true" gradients $g_t$. We make no assumptions about how the corruptions arise: they could be the result of outliers, mislabeled data, or even malicious interference. We focus on the difficult "unconstrained" setting, in which our algorithm must maintain low regret with respect to any comparison point $u \in \mathbb{R}^d$. The unconstrained setting is significantly more challenging because existing algorithms suffer extremely high regret even with tiny amounts of corruption (this is not the case for a bounded domain). Our algorithms guarantee regret $\|u\| G (\sqrt{T} + k)$ when $G \ge \max_t \|g_t\|$ is known, where $k$ is a measure of the total amount of corruption. When $G$ is unknown, we incur an extra additive penalty of $(\|u\|^2 + G^2) k$.
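The feedback model in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses plain online gradient descent with a fixed step size as a stand-in learner, linear losses with a constant true gradient, and sign-flipped gradients as the corruption. The function name `simulate`, the comparator `u`, the step size `lr`, and the choice to count corrupted rounds as the corruption measure `k` are all assumptions made for illustration.

```python
import random


def simulate(T=1000, d=2, corrupt_rounds=10, lr=0.05, G=1.0, seed=0):
    """Run a stand-in learner on corrupted gradients; return (regret, k).

    Regret is the linearized regret sum_t <g_t, w_t - u>, measured with the
    TRUE gradients g_t, while the learner only ever sees g_tilde_t.
    """
    rng = random.Random(seed)
    u = [1.0] * d  # hypothetical comparator in R^d
    w = [0.0] * d  # learner's iterate, unconstrained
    corrupted = set(rng.sample(range(T), corrupt_rounds))
    regret = 0.0
    k = 0  # assumed corruption measure: number of corrupted rounds
    for t in range(T):
        # True gradient of a linear loss, with Euclidean norm exactly G.
        g = [-G / d ** 0.5] * d
        regret += sum(gi * (wi - ui) for gi, wi, ui in zip(g, w, u))
        if t in corrupted:
            # Illustrative corruption: flip the sign (norm stays G).
            g_tilde = [-gi for gi in g]
            k += 1
        else:
            g_tilde = g
        # Stand-in update: online gradient descent on the observed gradient.
        w = [wi - lr * gti for wi, gti in zip(w, g_tilde)]
    return regret, k


if __name__ == "__main__":
    for rounds in (0, 10, 100):
        regret, k = simulate(corrupt_rounds=rounds)
        print(f"k={k:4d}  regret vs u={regret:.2f}")
```

Running the script prints the regret against the fixed comparator for several corruption levels, which makes it easy to see how the observed gradients $\tilde g_t$ drive the iterate while regret is still charged on the true gradients $g_t$.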
Lay Summary: Modern machine learning systems often rely on feedback to learn over time, but what happens if that feedback is wrong or misleading? For example, an algorithm might learn from mislabeled data, noisy measurements, or even adversarial attacks. Existing studies have addressed this problem when the domain is constrained, but that assumption does not hold in many real-world situations. In this work, we show that in the unconstrained domain, an appropriate regularization can address the problem. This research lays a theoretical foundation for learning from corrupted feedback in the unconstrained domain.
Primary Area: General Machine Learning->Online Learning, Active Learning and Bandits
Keywords: online learning, corrupted feedback, comparator adaptive
Submission Number: 5104