R-MADDPG for Partially Observable Environments and Limited Communication

Published: 28 May 2019 · Last Modified: 22 Oct 2023 · RL4RealLife 2019 · Readers: Everyone
Keywords: multiagent reinforcement learning
Abstract: Many real-world tasks would benefit from multiagent reinforcement learning (MARL) algorithms, including coordination among multiple agents such as self-driving cars or autonomous delivery drones. Real-world conditions are challenging for multiagent systems because the environment is partially observable and nonstationary. Moreover, if agents must share a limited resource (e.g., communication network bandwidth), they must all learn how to coordinate its use. These aspects make learning very challenging. This paper introduces a deep recurrent multiagent actor-critic framework for handling multiagent coordination under partially observable settings and limited communication. We investigate the effects of recurrency on the performance and communication use of a team of agents, and demonstrate that the resulting framework can learn time dependencies both for sharing missing observations and for handling resource limitations. It gives rise to different communication patterns among agents while performing as well as current multiagent actor-critic methods under fully observable settings.
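
To make the idea of a recurrent actor concrete, below is a minimal sketch in PyTorch of what a recurrent MADDPG-style actor could look like. The layer sizes, the use of an LSTM cell, and the separate "communicate" head that decides whether to spend limited bandwidth are illustrative assumptions for this sketch, not the authors' exact architecture.

```python
# Sketch of a recurrent actor for a MADDPG-style agent (illustrative only).
import torch
import torch.nn as nn

class RecurrentActor(nn.Module):
    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 64):
        super().__init__()
        # The LSTM cell carries a per-agent memory of past observations,
        # letting the policy act sensibly under partial observability.
        self.rnn = nn.LSTMCell(obs_dim, hidden_dim)
        self.action_head = nn.Linear(hidden_dim, action_dim)
        # Hypothetical extra head: probability of broadcasting this agent's
        # observation to teammates, so the policy can ration bandwidth.
        self.comm_head = nn.Linear(hidden_dim, 1)

    def forward(self, obs, hidden):
        h, c = self.rnn(obs, hidden)
        action = torch.tanh(self.action_head(h))      # continuous action
        comm_prob = torch.sigmoid(self.comm_head(h))  # P(send message)
        return action, comm_prob, (h, c)

# Usage: one decision step for a batch of 8 agents with 10-dim observations.
actor = RecurrentActor(obs_dim=10, action_dim=2)
obs = torch.randn(8, 10)
hidden = (torch.zeros(8, 64), torch.zeros(8, 64))
action, comm_prob, hidden = actor(obs, hidden)
```

Keeping the hidden state explicit (rather than wrapping a whole sequence in `nn.LSTM`) matches the step-by-step interaction loop of actor-critic training, and the same pattern could be applied to make the critic recurrent as well.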
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2002.06684/code)