Multi-Agent Reinforcement Learning in Stochastic Networked Systems

Yiheng Lin; Guannan Qu; Longbo Huang; Adam Wierman

Multi-Agent Reinforcement Learning in Stochastic Networked Systems

Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman

Published: 09 Nov 2021, Last Modified: 26 May 2025NeurIPS 2021 PosterReaders: Everyone

Keywords: Multi-agent reinforcement learning, networks, state aggregation, actor-critic

Abstract: We study multi-agent reinforcement learning (MARL) in a stochastic network of agents. The objective is to find localized policies that maximize the (discounted) global reward. In general, scalability is a challenge in this setting because the size of the global state/action space can be exponential in the number of agents. Scalable algorithms are only known in cases where dependencies are static, fixed and local, e.g., between neighbors in a fixed, time-invariant underlying graph. In this work, we propose a Scalable Actor Critic framework that applies in settings where the dependencies can be non-local and stochastic, and provide a finite-time error bound that shows how the convergence rate depends on the speed of information spread in the network. Additionally, as a byproduct of our analysis, we obtain novel finite-time convergence results for a general stochastic approximation scheme and for temporal difference learning with state aggregation, which apply beyond the setting of MARL in networked systems.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf

TL;DR: A scalable multi-agent reinforcement learning algorithm with provable convergence rates in a stochastic network.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/multi-agent-reinforcement-learning-in/code)

10 Replies

Loading