Near-Optimal No-Regret Learning in General Games

Constantinos Costis Daskalakis; Maxwell Fishelson; Noah Golowich

Published: 09 Nov 2021, Last Modified: 05 May 2023. NeurIPS 2021 Oral. Readers: Everyone

Keywords: No-regret learning, coarse correlated equilibrium, Optimistic Hedge

TL;DR: We prove a poly-logarithmic regret bound for no-regret learners in general-sum games.

Abstract: We show that Optimistic Hedge -- a common variant of multiplicative-weights-updates with recency bias -- attains ${\rm poly}(\log T)$ regret in multi-player general-sum games. In particular, when every player of the game uses Optimistic Hedge to iteratively update her action in response to the history of play so far, then after $T$ rounds of interaction, each player experiences total regret that is ${\rm poly}(\log T)$. Our bound improves, exponentially, the $O(T^{1/2})$ regret attainable by standard no-regret learners in games, the $O(T^{1/4})$ regret attainable by no-regret learners with recency bias (Syrgkanis et al., NeurIPS 2015), and the $O(T^{1/6})$ bound that was recently shown for Optimistic Hedge in the special case of two-player games (Chen & Peng, NeurIPS 2020). A direct corollary of our bound is that Optimistic Hedge converges to coarse correlated equilibrium in general games at a rate of $\tilde{O}(1/T)$.
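To make the update rule concrete, here is a minimal sketch of Optimistic Hedge in self-play. It is an illustration under stated assumptions, not the authors' code: it assumes a two-player game given by loss matrices with entries in [0, 1], full-information feedback (each player observes her expected loss vector every round), and a fixed step size `eta`. The names `softmax_neg` and `optimistic_hedge` are hypothetical. The recency bias shows up as counting the most recent loss vector twice when forming the next strategy.

```python
import math


def softmax_neg(scores, eta):
    """Return the distribution with weights proportional to exp(-eta * score)."""
    lo = min(scores)  # shift by the minimum for numerical stability
    weights = [math.exp(-eta * (s - lo)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def optimistic_hedge(loss_a, loss_b, T, eta):
    """Run Optimistic Hedge for both players of a two-player game (sketch).

    loss_a[i][j] and loss_b[i][j] are the losses (assumed to lie in [0, 1]) of
    the row and column player when row plays i and column plays j.
    Returns [row regret, column regret] against the best fixed action.
    """
    n, m = len(loss_a), len(loss_a[0])
    cum = [[0.0] * n, [0.0] * m]   # cumulative loss vector of each player
    last = [[0.0] * n, [0.0] * m]  # most recent loss vector (the optimistic guess)
    incurred = [0.0, 0.0]          # total expected loss actually suffered

    for _ in range(T):
        # Strategy: Hedge on cumulative losses plus the last loss counted once more.
        x = softmax_neg([c + l for c, l in zip(cum[0], last[0])], eta)
        y = softmax_neg([c + l for c, l in zip(cum[1], last[1])], eta)

        # Expected loss of each pure action against the opponent's mixed strategy.
        lx = [sum(loss_a[i][j] * y[j] for j in range(m)) for i in range(n)]
        ly = [sum(loss_b[i][j] * x[i] for i in range(n)) for j in range(m)]

        incurred[0] += sum(p * l for p, l in zip(x, lx))
        incurred[1] += sum(p * l for p, l in zip(y, ly))
        cum[0] = [c + l for c, l in zip(cum[0], lx)]
        cum[1] = [c + l for c, l in zip(cum[1], ly)]
        last = [lx, ly]

    # External regret: loss suffered minus loss of the best fixed action in hindsight.
    return [incurred[0] - min(cum[0]), incurred[1] - min(cum[1])]


if __name__ == "__main__":
    # A small general-sum game chosen arbitrarily for illustration.
    A = [[0.0, 1.0], [1.0, 0.5]]
    B = [[1.0, 0.0], [0.2, 1.0]]
    print(optimistic_hedge(A, B, T=1000, eta=0.1))
```

Dividing each returned regret by `T` gives the per-round average regret. If every player's average regret is at most some value, the empirical distribution of play is a coarse correlated equilibrium up to that value, which is how a ${\rm poly}(\log T)$ regret bound translates into the $\tilde{O}(1/T)$ convergence rate stated in the abstract. The step size `0.1` above is an arbitrary choice for the demo, not the tuning analyzed in the paper.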

Supplementary Material: pdf
