R-MADDPG for Partially Observable Environments and Limited Communication

Rose E. Wang; Michael Everett; Jonathan P. How

Published: 28 May 2019 | Last Modified: 13 Apr 2025 | RL4RealLife 2019 | Readers: Everyone

Keywords: multiagent reinforcement learning

Abstract: Several real-world tasks would benefit from multiagent reinforcement learning (MARL) algorithms, including the coordination of multiple agents such as self-driving cars or autonomous delivery drones. Real-world conditions are challenging for multiagent systems because the environment is partially observable and nonstationary. Moreover, if agents must share a limited resource (e.g., communication network bandwidth), they must all learn how to coordinate resource use. These aspects make learning very challenging. This paper introduces a deep recurrent multiagent actor-critic framework for handling multiagent coordination under partially observable settings and limited communication. We investigate the effects of recurrency on the performance and communication use of a team of agents, and demonstrate that the resulting framework is capable of learning time dependencies for not only sharing missing observations but also handling resource limitations. It gives rise to different communication patterns among agents, while still performing as well as current multiagent actor-critic methods under fully observable settings.

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/r-maddpg-for-partially-observable/code)
