TL;DR: We introduce the advantage metric to improve the PSRO framework for solving normal-form games.
Abstract: Computing a Nash equilibrium in normal-form games with large strategy spaces is challenging. Open-ended learning frameworks, such as Policy Space Response Oracles (PSRO) and its variants, have emerged as effective solutions. However, these methods often lack an efficient metric for evaluating strategy improvement, which limits their effectiveness in approximating equilibria.
In this paper, we introduce a novel evaluation metric called Advantage, whose properties are inherently connected to the Nash equilibrium, ensuring that each strategy update moves toward equilibrium.
Building upon this, we propose the Advantage Policy Space Response Oracle (A-PSRO), an innovative unified open-ended learning framework applicable to both zero-sum and general-sum games. A-PSRO leverages the Advantage as a refined evaluation metric, leading to a consistent learning objective for agents in normal-form games.
Experiments show that A-PSRO significantly reduces exploitability in zero-sum games and improves rewards in general-sum games, outperforming existing algorithms and validating its practical effectiveness.
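For readers unfamiliar with exploitability, the following is a minimal sketch of how it is commonly computed for a two-player zero-sum matrix game. It is not the Advantage metric and is not taken from the paper. The function name `exploitability`, the convention of summing both players' best-response gains (some works halve this value), and the rock-paper-scissors example are assumptions made for illustration.

```python
def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response gains in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative. Strategies are probability vectors over pure actions.
    (Hypothetical helper, assumed convention: unhalved sum of gains.)
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Expected row payoff of each pure row action against the column mix.
    row_values = [
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    # Expected row payoff when the column player uses each pure column
    # action against the row mix.
    col_values = [
        sum(row_strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    # The row player maximizes this payoff; the column player minimizes it.
    # The gap is zero exactly when the pair is a Nash equilibrium.
    return max(row_values) - min(col_values)


# Rock-paper-scissors payoffs for the row player (illustrative game).
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(exploitability(RPS, uniform, uniform))      # equilibrium pair
print(exploitability(RPS, always_rock, uniform))  # row player is exploitable
```

Under this convention the uniform profile has zero exploitability, while a row player who always plays rock can be exploited by a column player who switches to paper.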
Lay Summary: Game theory studies strategic interactions among multiple rational agents, and it can be used to explain real-world scenarios in politics and economics as well as common games such as chess and card games. A Nash equilibrium represents a stable state reached when these agents can no longer improve their strategies, and it is often considered the strongest strategy in a game. Thus, solving for a Nash equilibrium is equivalent to finding the optimal solution of the game. Previous research proposed PSRO as an efficient algorithm for computing Nash equilibria, but its efficiency is affected by the randomness in strategy exploration. This paper introduces the advantage function as an evaluation metric for strategy exploration. Thanks to its favorable theoretical properties, it accelerates the computation of Nash equilibria. Based on this, we propose the A-PSRO algorithm, which significantly improves equilibrium solving in games.
Primary Area: Theory->Game Theory
Keywords: PSRO, Game Theory, Nash Equilibrium
Submission Number: 4373