Distributed Reinforcement Learning for Multi-robot Decentralized Collective Construction

Guillaume Sartoretti, Yue Wu, William Paivine, T. K. Satish Kumar, Sven Koenig, Howie Choset

2018 (modified: 08 Jun 2022), DARS 2018, Readers: Everyone

Abstract: Inspired by recent advances in single-agent reinforcement learning, this paper extends the single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy and critic learning, but decentralized policy execution, in a fully observable system. We show that the combined experience of all agents can be leveraged to quickly train a collaborative policy that naturally scales to smaller and larger swarms. We demonstrate the applicability of our method on a multi-robot construction problem, where agents need to arrange simple block elements to build a user-specified structure. We present simulation results where swarms of various sizes successfully construct different test structures without the need for additional training.
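
The idea of centralized learning with decentralized execution can be illustrated with a small toy sketch. It is not the authors' implementation. It replaces the asynchronous neural-network A3C setup with a synchronous, tabular advantage actor-critic. The environment (a counter of placed blocks), the reward values, and all names (`SharedActorCritic`, `step`, `run_episode`, `TARGET`) are assumptions made only for illustration. What it tries to show is that every agent acts from, and writes its experience into, one shared set of parameters, so the trained policy can be executed by a different number of agents without retraining.

```python
import math
import random

TARGET = 3          # toy "structure": three blocks to place (assumption)
WAIT, PLACE = 0, 1  # hypothetical action set


class SharedActorCritic:
    """Tabular policy logits and state values shared by all agents."""

    def __init__(self):
        self.logits = [[0.0, 0.0] for _ in range(TARGET + 1)]
        self.values = [0.0] * (TARGET + 1)

    def policy(self, state):
        # Numerically stable softmax over the two action logits.
        m = max(self.logits[state])
        exps = [math.exp(x - m) for x in self.logits[state]]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, state, rng):
        return WAIT if rng.random() < self.policy(state)[WAIT] else PLACE

    def update(self, s, a, r, s_next, lr=0.1, gamma=0.9):
        # One-step advantage estimate: r + gamma * V(s') - V(s).
        advantage = r + gamma * self.values[s_next] - self.values[s]
        probs = self.policy(s)
        for b in (WAIT, PLACE):
            indicator = 1.0 if b == a else 0.0
            # Policy-gradient step for a softmax policy.
            self.logits[s][b] += lr * advantage * (indicator - probs[b])
        self.values[s] += lr * advantage  # critic moves toward the target


def step(state, action):
    """Toy dynamics and rewards (illustrative values, not from the paper)."""
    if action == PLACE and state < TARGET:
        return state + 1, 1.0   # useful block placed
    if action == PLACE:
        return state, -1.0      # tried to build on a finished structure
    return state, (-0.1 if state < TARGET else 0.0)  # waiting


def run_episode(model, n_agents, rng, learn=True, max_rounds=10):
    state = 0  # fully observable world state: number of blocks placed
    for _ in range(max_rounds):
        for _agent in range(n_agents):
            # Decentralized execution: each agent samples its own action
            # from the same shared policy, with no explicit communication.
            action = model.act(state, rng)
            next_state, reward = step(state, action)
            if learn:
                # Centralized learning: every agent's transition updates
                # the one shared actor and critic.
                model.update(state, action, reward, next_state)
            state = next_state
    return state


rng = random.Random(0)
model = SharedActorCritic()
for _ in range(500):
    run_episode(model, n_agents=4, rng=rng)  # train with four agents
```

After training with four agents, the same `model` can be passed to `run_episode(model, n_agents=2, rng=rng, learn=False)` or to a call with `n_agents=8` to execute the shared policy with a smaller or larger group.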
