Learning Nash Equilibria in Normal-Form Games via Approximating Stationary Points

Submitted to ICLR 2025 on 26 Sept 2024 (modified: 05 Feb 2025). License: CC BY 4.0.
Keywords: Deep Learning, Nash Equilibrium, Normal-Form Games
TL;DR: We propose a novel unbiased loss function for learning a Nash equilibrium in normal-form games via Deep Learning.
Abstract: The Nash equilibrium (NE) plays a central role in game theory, yet learning an NE in normal-form games (NFGs) is a complex, non-convex optimization problem. Deep Learning (DL), the cornerstone of modern artificial intelligence, has demonstrated remarkable empirical performance across many applications involving non-convex optimization. Applying DL to learn an NE is difficult, however, because most existing loss functions for this purpose are biased under sampled play. A recent work proposed an unbiased loss function, but it suffers from high variance, which degrades the convergence rate; moreover, learning an NE with this loss requires finding a global minimum of a non-convex objective, which is inherently hard. To improve the convergence rate by mitigating the high variance of the existing unbiased loss function, we propose a novel loss function, named Nash Advantage Loss (NAL). NAL is unbiased and exhibits significantly lower variance than the existing unbiased loss function. In addition, an NE is a stationary point of NAL rather than necessarily a global minimum, which improves computational efficiency. Experimental results demonstrate that the algorithm minimizing NAL achieves significantly faster empirical convergence than previous algorithms, while also reducing the variance of the estimated loss value by several orders of magnitude.
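To make the learning objective concrete: a standard way to measure how far a strategy profile is from an NE in a two-player NFG is its exploitability (the total best-response gain), which is zero exactly at an NE. The sketch below illustrates this baseline quantity only; it is not the paper's NAL, and all names in it are our own, not from the submission.

```python
# Illustrative sketch (NOT the paper's NAL): exploitability of a mixed-strategy
# profile in a two-player normal-form game. "Learning an NE" means driving this
# quantity to zero.
import numpy as np

def exploitability(A, B, x, y):
    """Sum of both players' best-response gains over the profile (x, y).

    A, B: payoff matrices for the row and column player.
    x, y: mixed strategies (probability vectors).
    Returns 0 if and only if (x, y) is a Nash equilibrium.
    """
    gain_row = (A @ y).max() - x @ A @ y   # row player's best-response improvement
    gain_col = (x @ B).max() - x @ B @ y   # column player's best-response improvement
    return gain_row + gain_col

# Matching Pennies: the uniform profile is the unique NE, so exploitability is 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A  # zero-sum game
x = y = np.array([0.5, 0.5])
print(exploitability(A, B, x, y))  # 0.0
```

Deep-learning approaches to this problem parameterize the strategies with a network and minimize a loss whose minimizers (or, for NAL, stationary points) coincide with NE; the bias and variance issues discussed above arise when such a loss must be estimated from sampled play rather than the full payoff matrices.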
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6366