Keywords: online learning, online convex optimization, parameter-free
TL;DR: We provide regret guarantees without prior bounds on Lipschitz constants or comparator norms.
Abstract:
We provide a technique for OLO that obtains regret $G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})} + \|w_\star\|^2 + G^2$ on $G$-Lipschitz losses for any comparison point $w_\star$ without knowing either $G$ or $\|w_\star\|$. Importantly, this matches the optimal bound $G\|w_\star\|\sqrt{T}$ available with such knowledge (up to logarithmic factors), unless either $\|w_\star\|$ or $G$ is so large that even $G\|w_\star\|\sqrt{T}$ is roughly linear in $T$. Thus, at a high level, it matches the optimal bound in every case in which sublinear regret is achievable.
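To make the "up to logarithmic factors" claim concrete, here is a toy numeric sketch comparing the abstract's parameter-free bound against the optimal bound achievable with prior knowledge of $G$ and $\|w_\star\|$. The function names and example parameter values are illustrative, not from the paper.

```python
import math

def known_params_bound(G, w_norm, T):
    # Optimal regret G * ||w_*|| * sqrt(T), achievable when G and ||w_*|| are known.
    return G * w_norm * math.sqrt(T)

def parameter_free_bound(G, w_norm, T):
    # Bound from the abstract:
    #   G ||w_*|| sqrt(T log(||w_*|| G sqrt(T))) + ||w_*||^2 + G^2.
    # The max(..., 1.0) guard (an assumption of this sketch) avoids a
    # non-positive log for very small ||w_*|| G sqrt(T).
    log_term = max(math.log(w_norm * G * math.sqrt(T)), 1.0)
    return G * w_norm * math.sqrt(T * log_term) + w_norm**2 + G**2

# Illustrative setting: modest G and ||w_*||, long horizon.
G, w_norm, T = 1.0, 5.0, 10_000
ratio = parameter_free_bound(G, w_norm, T) / known_params_bound(G, w_norm, T)
print(f"parameter-free / known-params ratio: {ratio:.2f}")
```

For these values the ratio is a small constant (roughly the square root of the log factor), illustrating that the overhead for not knowing $G$ or $\|w_\star\|$ stays logarithmic as long as $G\|w_\star\|\sqrt{T}$ itself is not already nearly linear in $T$.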
Primary Area: Online learning
Submission Number: 12919