Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: two-player zero-sum games, last-iterate convergence, regret matching, no-regret learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$) and its variants, are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent ascent, which have strong last-iterate and ergodic convergence properties for zero-sum games, virtually nothing is known about the last-iterate properties of regret-matching algorithms. Since last-iterate convergence is an attractive property, both for numerical-optimization reasons and because no-regret learning is viewed as a plausible model of real-world learning in games, in this paper we study the last-iterate convergence properties of various popular variants of RM$^+$. First, we show numerically that several practical variants, including simultaneous RM$^+$, alternating RM$^+$, and simultaneous predictive RM$^+$, lack last-iterate convergence even on a simple $3\times 3$ game. Then, we show that recent variants of these algorithms based on a *smoothing* technique do enjoy last-iterate convergence: we prove that *extragradient RM$^{+}$* and *smooth PRM$^+$* enjoy asymptotic last-iterate convergence (without a rate) and $1/\sqrt{t}$ best-iterate convergence. Finally, we introduce restarted variants of these algorithms and show that, in both cases, they enjoy linear-rate last-iterate convergence.
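For context, the following is a minimal Python/NumPy sketch of the simultaneous RM$^+$ dynamic on a two-player zero-sum matrix game $\min_x \max_y x^\top A y$ (function names and the stopping horizon are illustrative, not from the paper). The abstract's point is that the last iterate $(x_t, y_t)$ of this baseline update need not converge, even though time-averaged strategies do.

```python
import numpy as np

def rm_plus_strategy(R):
    """Map a nonnegative clipped-regret vector to a strategy (uniform if all zeros)."""
    s = R.sum()
    return R / s if s > 0 else np.ones_like(R) / len(R)

def simultaneous_rm_plus(A, T=10_000):
    """Run T iterations of simultaneous RM+ on min_x max_y x^T A y;
    returns the last-iterate strategy pair (x_T, y_T)."""
    m, n = A.shape
    Rx = np.zeros(m)  # clipped cumulative regrets, row (min) player
    Ry = np.zeros(n)  # clipped cumulative regrets, column (max) player
    for _ in range(T):
        x, y = rm_plus_strategy(Rx), rm_plus_strategy(Ry)
        ux = A @ y        # row player's per-action losses against y
        uy = A.T @ x      # column player's per-action payoffs against x
        v = x @ ux        # current value x^T A y
        # RM+: add instantaneous regrets, then clip cumulative regrets at zero
        Rx = np.maximum(Rx + (v - ux), 0.0)   # min player deviates to lower loss
        Ry = np.maximum(Ry + (uy - v), 0.0)   # max player deviates to higher payoff
    return rm_plus_strategy(Rx), rm_plus_strategy(Ry)
```

The smoothed variants studied in the paper (extragradient RM$^+$, smooth PRM$^+$) and their restarted versions modify this update; the sketch above only illustrates the baseline dynamic whose iterates are shown not to converge.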
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4453