TL;DR: Non-stationary Online Learning for Curved Losses: Improved Dynamic Regret via Mixability
Abstract: Non-stationary online learning has drawn much attention in recent years. Despite considerable progress, dynamic regret minimization has primarily focused on convex functions, leaving functions with stronger curvature (e.g., squared or logistic loss) underexplored. In this work, we address this gap by showing that the regret can be substantially improved by leveraging the concept of mixability, a property that generalizes exp-concavity to effectively capture loss curvature. Let $d$ denote the dimensionality and $P_T$ the path length of comparators, which reflects the environmental non-stationarity. We demonstrate that an exponential-weight method with fixed-share updates achieves an $\mathcal{O}(d T^{1/3} P_T^{2/3} \log T)$ dynamic regret for mixable losses, improving upon the best-known $\mathcal{O}(d^{10/3} T^{1/3} P_T^{2/3} \log T)$ result (Baby & Wang, 2021) in its dependence on $d$. More importantly, this improvement arises from a simple yet powerful analytical framework that exploits mixability and avoids the Karush–Kuhn–Tucker-based analysis required by existing work.
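To illustrate the style of algorithm the abstract refers to, here is a minimal sketch of exponential weights with fixed-share updates over a finite set of experts. This is a generic textbook-style implementation for illustration only, not the paper's actual method; the learning rate `eta` and mixing rate `alpha` are placeholder parameters, and the paper's analysis (mixability, dynamic regret bounds) is not reflected here.

```python
import numpy as np

def fixed_share_exp_weights(losses, eta=1.0, alpha=0.01):
    """Exponential weights with fixed-share mixing (illustrative sketch).

    losses: (T, N) array, losses[t, i] is the loss of expert i at round t.
    eta:    learning rate for the exponential update.
    alpha:  fixed-share mixing rate; each round, a fraction alpha of the
            weight mass is redistributed uniformly, which lets the learner
            re-track experts that become good again (non-stationarity).
    Returns the cumulative loss of the weighted forecaster and the final
    weight vector.
    """
    T, N = losses.shape
    w = np.ones(N) / N          # start from the uniform distribution
    total_loss = 0.0
    for t in range(T):
        total_loss += float(w @ losses[t])   # forecaster's expected loss
        # Exponential-weight update, then renormalize.
        w = w * np.exp(-eta * losses[t])
        w = w / w.sum()
        # Fixed-share step: mix with the uniform distribution.
        w = (1.0 - alpha) * w + alpha / N
    return total_loss, w
```

The fixed-share step keeps every expert's weight bounded away from zero (at least `alpha / N`), which is what allows the method to adapt when the best expert changes over time.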
Lay Summary: In this paper, we study how to learn from and make predictions with streaming data in non-stationary environments, where data arrive sequentially and their patterns can change over time. Our goal is to maintain a model that consistently makes accurate predictions throughout the entire sequence. This work is primarily theoretical, focusing on developing methods with strong mathematical guarantees on their performance. Specifically, we measure the performance of our methods using dynamic regret, where lower values indicate better adaptability to changing environments. Our main result is that by leveraging the curvature of the loss function, one can achieve better theoretical guarantees than methods that do not exploit this property. Similar results were achieved by previous work, but our method further improves the theoretical guarantees while using a simple yet effective analytical framework that avoids the complex analysis employed in earlier work.
Primary Area: General Machine Learning->Online Learning, Active Learning and Bandits
Keywords: Online Learning, Dynamic Regret, Mixability
Submission Number: 9566