Keywords: Networked Markov Potential Game, Multi-Agent Reinforcement Learning, Convergence Rates, Localized TD-Learning
TL;DR: We design a localized actor-critic algorithm for networked Markov potential games with provable finite-time convergence guarantees.
Abstract: We introduce a class of networked Markov potential games where agents are associated with nodes in a network. Each agent has its own local potential function, and the reward of each agent depends only on the states and actions of agents within a neighborhood. In this context, we propose a localized actor-critic algorithm. The algorithm is scalable since each agent uses only local information and does not need access to the global state. Further, the algorithm overcomes the curse of dimensionality through the use of function approximation. Our main results provide finite-sample guarantees up to a localization error and a function approximation error. Specifically, we achieve an $\tilde{\mathcal{O}}(\tilde{\epsilon}^{-4})$ sample complexity measured by the averaged Nash regret. This is the first finite-sample bound for multi-agent competitive games that does not depend on the number of agents.
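As a rough illustration of what "each agent uses only local information" can mean in practice, the sketch below builds a κ-hop neighborhood on a ring network and runs a single TD(0) update with linear function approximation on features of that neighborhood only. It is a minimal sketch under stated assumptions, not the paper's algorithm: the ring topology, the binary local states, the one-hot features, and the names `k_hop_neighborhood`, `local_features`, and `local_td0_step` are all hypothetical and chosen only for illustration.

```python
from collections import deque

import numpy as np


def k_hop_neighborhood(adj, i, k):
    """Return the sorted nodes within k hops of node i (including i), via BFS."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sorted(dist)


def local_features(global_state, neighborhood):
    """One-hot encoding of the binary states of the agents in the neighborhood.

    Assumption: each agent's local state is a single bit.
    """
    idx = 0
    for j in neighborhood:
        idx = 2 * idx + int(global_state[j])
    phi = np.zeros(2 ** len(neighborhood))
    phi[idx] = 1.0
    return phi


def local_td0_step(w, phi, reward, phi_next, gamma=0.9, alpha=0.1):
    """One TD(0) update for the linear value estimate w @ phi."""
    td_error = reward + gamma * (w @ phi_next) - (w @ phi)
    return w + alpha * td_error * phi


# Hypothetical ring network of 6 agents.
n = 6
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

nbhd = k_hop_neighborhood(adj, 0, 1)   # agent 0 and its two ring neighbors
state = [1, 0, 0, 0, 0, 1]             # global state; agent 0 reads only nbhd entries
phi = local_features(state, nbhd)

w = np.zeros(phi.shape[0])             # critic weights sized by the neighborhood, not by n
w = local_td0_step(w, phi, reward=1.0, phi_next=phi)
```

In this toy setup the critic's weight vector has length `2 ** len(nbhd)`, so its size is fixed by the neighborhood radius rather than by the total number of agents, which is the kind of scalability the abstract describes.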
Supplementary Material: pdf
Other Supplementary Material: zip
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/convergence-rates-for-localized-actor-critic/code)