QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: multi-agent reinforcement learning
Abstract: QTRAN is a multi-agent reinforcement learning (MARL) algorithm capable of learning the largest class of joint-action value functions up to date. However, despite its strong theoretical guarantee, it has shown poor empirical performance in complex environments, such as Starcraft Multi-Agent Challenge (SMAC). In this paper, we identify the performance bottleneck of QTRAN and propose a substantially improved version, coined QTRAN++. Our gains come from (i) stabilizing the training objective of QTRAN, (ii) removing the strict role separation between the action-value estimators of QTRAN, and (iii) introducing a multi-head mixing network for value transformation. Through extensive evaluation, we confirm that our diagnosis is correct, and QTRAN++ successfully bridges the gap between empirical performance and theoretical guarantee. In particular, QTRAN++ newly achieves state-of-the-art performance in the SMAC environment. The code will be released.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We propose a novel cooperative multi-agent reinforcement learning algorithm with state-of-the-art performance.
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=E3VntBjSDW