The Cyclical Chaos And Its Equilibrium

22 Sept 2023 (modified: 11 Feb 2024)
Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Game Theory, Graph, Fixed Point Theorem, Noncooperative Game, Self-Play, Multi-Agent Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Finding a Nash Equilibrium (NE) in noncooperative games is a fundamental challenge in game theory and artificial intelligence, but existing methods that address the cyclical strategy problem can be computationally demanding. Methods such as Policy Space Response Oracles (PSRO) let agents learn a best response (BR) policy against all prior policies. Once the learned policy converges, it is added to the policy sequence, and the process repeats until an NE is identified. While learning against all prior policies prevents agents' strategy interactions from descending into cyclical chaos, it increases computational demands because the population of opponents keeps expanding. Our research offers a new perspective. We argue that cyclical strategies are not chaotic anomalies to be avoided; instead, they are orderly sequences integral to an equilibrium. We establish the theoretical equivalence between a complete set of cyclical strategies and the support set of a Mixed Strategy NE (MSNE). Our proof intuitively demonstrates that the cyclical strategies must form a circular counter, implying that a complete set is necessary to support an MSNE due to the intrinsic counterbalancing dynamic. This enables a novel graph-based representation of self-play learning in which finding an NE becomes a graph search. Our empirical results show improved self-play efficiency in discovering both a Pure Strategy NE (PSNE) and an MSNE in noncooperative games such as Connect4 and Naruto Mobile.
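As a rough illustration of the abstract's central claim, the sketch below uses Rock-Paper-Scissors, which is not one of the paper's benchmarks. It follows pure best responses until they loop, then checks that a mix over the complete cycle is a symmetric NE while a mix over only part of the cycle is not. The helper names (`best_response_cycle`, `uniform_mix`, `is_symmetric_nash`) are hypothetical, the uniform weighting is an assumption that happens to hold for this game, and none of this is the authors' graph search method.

```python
from fractions import Fraction

# Row player's payoffs in Rock-Paper-Scissors (symmetric, zero-sum toy game).
ACTIONS = ["rock", "paper", "scissors"]
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]
N = len(ACTIONS)

def best_response(opponent):
    """Index of the pure strategy with the highest payoff against `opponent`."""
    return max(range(N), key=lambda a: PAYOFF[a][opponent])

def best_response_cycle(start):
    """Follow pure best responses from `start` until one repeats; return the loop."""
    path = [start]
    while True:
        nxt = best_response(path[-1])
        if nxt in path:
            return path[path.index(nxt):]
        path.append(nxt)

def uniform_mix(support):
    """Equal weight on every strategy in `support` (an assumption valid for this game)."""
    weight = Fraction(1, len(support))
    return [weight if a in support else Fraction(0) for a in range(N)]

def is_symmetric_nash(mix):
    """True if no pure deviation earns more than `mix` does when played against `mix`."""
    payoffs = [sum(PAYOFF[a][b] * mix[b] for b in range(N)) for a in range(N)]
    value = sum(mix[a] * payoffs[a] for a in range(N))
    return all(p <= value for p in payoffs)

cycle = best_response_cycle(0)
cycle_names = [ACTIONS[a] for a in cycle]
full_is_ne = is_symmetric_nash(uniform_mix(cycle))          # complete cycle as support
partial_is_ne = is_symmetric_nash(uniform_mix(cycle[:2]))   # one strategy dropped
```

Starting from rock, the best responses visit paper and scissors before returning to rock. The mix over all three strategies passes the check, whereas the mix over rock and paper alone is exploited by paper, which matches the counterbalancing intuition described in the abstract. In general games the equilibrium weights over the support need not be uniform.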
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5085