Decentralized Deterministic Multi-Agent Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: multiagent reinforcement learning, MARL, decentralized actor-critic algorithm
Abstract: Recent work in multi-agent reinforcement learning (MARL) by [Zhang et al., ICML 2018] provided the first decentralized actor-critic algorithm to offer convergence guarantees. In that work, policies are stochastic and are defined on finite action spaces. We extend those results to develop a provably-convergent decentralized actor-critic algorithm for learning deterministic policies on continuous action spaces. Deterministic policies are important in many real-world settings. To handle the lack of exploration inherent in deterministic policies, we provide results for the off-policy setting as well as the on-policy setting. We present the main ingredients needed for this problem: an expression for the local deterministic policy gradient, a decentralized deterministic actor-critic algorithm, and convergence guarantees when the value functions are approximated linearly. This work enables decentralized MARL in high-dimensional action spaces and paves the way for more widespread application of MARL.
One-sentence Summary: We provide a provably-convergent decentralized actor-critic algorithm for learning deterministic reinforcement learning policies on continuous action spaces.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=nwzC9dWvDl
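
A minimal, illustrative sketch of the kind of update the abstract describes: each agent keeps a linear critic and a deterministic (here linear) policy, performs a local TD(0) critic update on its own reward, averages critic parameters with its neighbors over a communication graph, and then takes a deterministic policy-gradient step. The feature map, the ring consensus matrix, and all names (phi, step, Theta, w, C) are assumptions made for illustration, not the paper's actual construction.

import numpy as np

rng = np.random.default_rng(0)

N = 3             # number of agents
STATE_DIM = 4     # global state dimension
ACT_DIM = 2       # each agent's continuous action dimension
FEAT_DIM = STATE_DIM + N * ACT_DIM   # joint critic feature dimension (assumed)
GAMMA = 0.95      # discount factor
ALPHA_W = 0.05    # critic step size
ALPHA_TH = 0.01   # actor step size

# Doubly stochastic consensus matrix over a ring communication graph (assumption).
C = np.zeros((N, N))
for i in range(N):
    C[i, i] = 0.5
    C[i, (i + 1) % N] = 0.25
    C[i, (i - 1) % N] = 0.25

# Linear deterministic policies mu_i(s) = Theta_i s and linear critics Q_i(s, a) = w_i . phi(s, a).
Theta = [rng.normal(scale=0.1, size=(ACT_DIM, STATE_DIM)) for _ in range(N)]
w = [rng.normal(scale=0.1, size=FEAT_DIM) for _ in range(N)]

def phi(s, joint_a):
    # Joint critic features: concatenate the state with all agents' actions (assumed).
    return np.concatenate([s] + list(joint_a))

def step(s, s_next, rewards):
    # One synchronous update for a transition; rewards[i] is agent i's local reward.
    joint_a = [Theta[i] @ s for i in range(N)]             # deterministic joint action
    joint_a_next = [Theta[i] @ s_next for i in range(N)]
    f, f_next = phi(s, joint_a), phi(s_next, joint_a_next)

    # Local TD(0) critic updates, each driven by the agent's own reward.
    w_half = []
    for i in range(N):
        td = rewards[i] + GAMMA * w[i] @ f_next - w[i] @ f
        w_half.append(w[i] + ALPHA_W * td * f)

    # Consensus step: each agent mixes critic parameters with its neighbors.
    for i in range(N):
        w[i] = sum(C[i, j] * w_half[j] for j in range(N))

    # Local deterministic policy gradient: grad_a Q restricted to agent i's action block,
    # chained through mu_i(s) = Theta_i s.
    for i in range(N):
        a_start = STATE_DIM + i * ACT_DIM
        grad_a_Q = w[i][a_start:a_start + ACT_DIM]         # linear critic => constant in a
        Theta[i] += ALPHA_TH * np.outer(grad_a_Q, s)

# Toy usage on random transitions (illustration only).
s = rng.normal(size=STATE_DIM)
for _ in range(10):
    s_next = rng.normal(size=STATE_DIM)
    step(s, s_next, rewards=rng.normal(size=N))
    s = s_next

The consensus averaging of critic parameters is what makes such a scheme decentralized: each agent only observes its own reward and its neighbors' critic parameters, mirroring the structure of the [Zhang et al., ICML 2018] setup that the abstract extends to deterministic policies on continuous action spaces.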