A Policy Optimization Approach to the Solution of Unregularized Mean Field Games

Published: 17 Jun 2024, Last Modified: 27 Jun 2024 · FoRLaC Poster · CC BY 4.0
Abstract: We study the problem of finding the equilibrium of a mean field game (MFG): a policy that performs optimally in a Markov decision process (MDP) determined by the mean field, which is a distribution over a population of agents and a function of the policy. Prior solution techniques build on fixed-point iteration and are only guaranteed to solve a regularized approximation of the problem, with a regularization constant large enough that the equilibrium is the unique fixed point of a contraction mapping. The resulting regularized solution can deviate arbitrarily from the original equilibrium. In this work, we demonstrate for the first time how direct gradient-based policy optimization, rather than fixed-point iteration, can solve the original, unregularized infinite-horizon average-reward MFG. In particular, we propose the Accelerated Single-loop Actor-Critic algorithm for Mean Field Games (ASAC-MFG), which, as the name suggests, is completely data-driven, single-loop, and single-sample-path. We characterize the finite-time and finite-sample convergence of ASAC-MFG to a mean field equilibrium, building on a novel multi-time-scale analysis without regularization. We support the theoretical results with numerical simulations that illustrate the superior convergence of the proposed algorithm.
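For intuition on the kind of update the abstract describes, below is a minimal sketch of a single-loop, multi-time-scale actor-critic for a tabular mean field game, run along a single sample path: a fast critic step, an intermediate mean-field tracking step, and a slow actor step. The toy MDP, reward form, step-size exponents, and all names are illustrative assumptions; this is not the ASAC-MFG algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3

# Toy transition kernel: P[s, a] is a distribution over next states (assumed).
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def reward(s, a, mu):
    # Reward couples the agent's state-action pair to the mean field mu
    # (population state distribution); this congestion-style form is an assumption.
    return -mu[s] + 0.1 * a

def softmax_policy(theta, s):
    z = theta[s] - theta[s].max()
    p = np.exp(z)
    return p / p.sum()

theta = np.zeros((n_states, n_actions))   # actor parameters
V = np.zeros(n_states)                    # critic: differential value estimates
rho = 0.0                                 # average-reward estimate
mu = np.ones(n_states) / n_states         # running mean-field estimate

s = rng.integers(n_states)
for t in range(1, 50_001):
    # Three time scales: critic fastest, mean field intermediate, actor slowest.
    a_crit, a_mf, a_act = 1.0 / t**0.55, 1.0 / t**0.7, 1.0 / t**0.85

    pi_s = softmax_policy(theta, s)
    a = rng.choice(n_actions, p=pi_s)
    s_next = rng.choice(n_states, p=P[s, a])
    r = reward(s, a, mu)

    # Critic: average-reward TD(0) update along the single sample path.
    delta = r - rho + V[s_next] - V[s]
    V[s] += a_crit * delta
    rho += a_crit * delta

    # Mean field: track the state occupancy induced by the current policy.
    mu = (1 - a_mf) * mu
    mu[s] += a_mf

    # Actor: policy-gradient step using the TD error as the advantage signal.
    grad_log = -pi_s
    grad_log[a] += 1.0
    theta[s] += a_act * delta * grad_log

    s = s_next

print("estimated average reward:", rho)
print("mean-field estimate:", np.round(mu, 3))
```

All three updates happen inside one loop over a single trajectory, which is what "single-loop, single-sample-path" refers to; the separation of step sizes is what a multi-time-scale analysis exploits.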
Format: Long format (up to 8 pages + refs, appendix)
Publication Status: No
Submission Number: 11