Learning Nash Equilibrium of Markov Potential Games with a Shared Constraint via Primal-Dual Optimization
Abstract: The problem of constrained Markov games has recently attracted interest in the study of multi-agent reinforcement learning (MARL). The existing literature has focused on safe MARL problems where safety constraints are imposed on each agent individually. In this work, we consider a Markov potential game (MPG) with a shared constraint, where the constraint cost function depends on the states and the joint actions of all agents. We adopt a primal-dual framework to tackle the problem and establish the Slater condition to ensure strong duality. Moreover, we propose a primal-dual learning algorithm for learning an approximate Nash equilibrium of the MPG with a shared constraint. Thanks to the novel design of the dual update, we provide convergence guarantees for the weighted output policy. Specifically, we prove that both the value function gap and the constraint violation of the output policy converge at rate O(epsilon + 1/sqrt(T)), where epsilon is the accuracy level of the primal update and T is the number of iterations. We further show that the weighted output policy outperforms the uniformly chosen policy used in existing work.
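To give a concrete picture of the primal-dual structure described in the abstract, the following is a minimal sketch in Python of such a loop on a toy two-agent, single-state potential game with a shared constraint cost. All quantities (the potential phi, the reward matrices R1 and R2, the constraint cost C, the budget b, the step sizes) as well as the softmax policy-gradient primal step and the uniform iterate weighting are illustrative assumptions for this sketch; the paper's actual dual update and weighted output policy are designed differently.

```python
import numpy as np

# Toy setup (hypothetical): two agents, two actions each, single state.
# Identical-interest rewards derived from a potential phi, plus a shared
# constraint cost C(a1, a2) that must stay below a budget b in expectation.
rng = np.random.default_rng(0)
phi = rng.normal(size=(2, 2))      # potential Phi(a1, a2)
R1 = R2 = phi                      # identical-interest game is a potential game
C = rng.uniform(size=(2, 2))       # shared constraint cost c(a1, a2)
b = 0.5                            # constraint budget: E[c] <= b

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

theta1 = np.zeros(2)               # policy logits, agent 1
theta2 = np.zeros(2)               # policy logits, agent 2
lam = 0.0                          # dual variable for the shared constraint
eta_p, eta_d = 0.5, 0.1            # primal / dual step sizes (illustrative)
T = 2000

avg_pi1 = np.zeros(2)              # output policies, uniform weights in this sketch
avg_pi2 = np.zeros(2)

for t in range(T):
    pi1, pi2 = softmax(theta1), softmax(theta2)

    # Lagrangian payoff seen by each agent: reward minus lam * constraint cost.
    L1 = R1 - lam * C
    L2 = R2 - lam * C

    # Primal step: each agent ascends its expected Lagrangian payoff given the
    # other's current policy (the paper assumes an epsilon-accurate primal
    # update; plain softmax policy gradient is used here for illustration).
    q1 = L1 @ pi2                  # agent 1's action values
    q2 = L2.T @ pi1                # agent 2's action values
    theta1 += eta_p * pi1 * (q1 - pi1 @ q1)
    theta2 += eta_p * pi2 * (q2 - pi2 @ q2)

    # Dual step: projected gradient ascent on the shared constraint violation.
    violation = pi1 @ C @ pi2 - b
    lam = max(0.0, lam + eta_d * violation)

    avg_pi1 += pi1 / T
    avg_pi2 += pi2 / T

print("output policy agent 1:", avg_pi1)
print("output policy agent 2:", avg_pi2)
print("constraint cost under averaged policies:", avg_pi1 @ C @ avg_pi2, "budget:", b)
```

In this sketch the dual variable lam penalizes the shared constraint cost in every agent's objective, so the primal steps trade reward against constraint satisfaction while the dual step tightens or loosens that penalty based on the current violation.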