Group-oriented Cooperation in Multi-Agent Reinforcement Learning

Yifan Zang; Jinmin He; Kai Li; Haobo Fu; Qiang Fu; Junliang Xing; Jian Cheng

Published: 01 Feb 2023, Last Modified: 13 Feb 2023

Venue: ICLR 2023 Conference Withdrawn Submission

Readers: Everyone

Keywords: MARL, Multi-Agent Reinforcement Learning, Group-wise Learning

TL;DR: We propose an automatic grouping mechanism in cooperative MARL, which dynamically adjusts the grouping of agents as training proceeds and achieves efficient team cooperation by facilitating intra- and inter-group coordination.

Abstract: Grouping is ubiquitous in natural systems and is essential for efficient team coordination. This paper introduces the concept of grouping into multi-agent reinforcement learning (MARL) and provides a novel formulation of Group-oriented MARL (GoMARL). In contrast to existing approaches that attempt to directly learn the complex relationship between the joint action-values and individual values, we use groups as a bridge: each group models the connections among a small set of agents and encourages cooperation among them, thereby improving the efficiency of the whole team. In particular, we factorize the joint action-values as a combination of group-wise values, which guide agents to improve their policies in a fine-grained fashion. We propose a flexible grouping mechanism inspired by variable selection and sparse regularization to generate dynamic groups and group action-values. We further propose a hierarchical control for policy learning that drives agents in the same group to specialize in similar policies while keeping strategies diverse across groups. Extensive experiments on a challenging set of StarCraft II micromanagement tasks and Google Research Football scenarios verify our method's effectiveness and learning efficiency. Detailed component studies show how grouping works and how it enhances performance.
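
The factorization described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the additive group mixer, the non-negative mixing weights, the argmax-based regrouping rule, and every function name below are assumptions chosen only to show how per-agent values, group-wise values, a joint value, and a sparsity penalty might fit together.

```python
from typing import List


def assign_groups(selection: List[List[float]]) -> List[List[int]]:
    """Place each agent in the group whose selection weight has the largest magnitude.

    selection[g][i] is a hypothetical learnable weight linking agent i to group g.
    Using argmax to form a partition is a simplification assumed for this sketch.
    """
    n_groups = len(selection)
    n_agents = len(selection[0])
    groups: List[List[int]] = [[] for _ in range(n_groups)]
    for i in range(n_agents):
        best = max(range(n_groups), key=lambda g: abs(selection[g][i]))
        groups[best].append(i)
    return groups


def group_values(agent_q: List[float], groups: List[List[int]]) -> List[float]:
    """Sum the members' individual values into one value per group (assumed additive mixing)."""
    return [sum(agent_q[i] for i in members) for members in groups]


def joint_value(group_q: List[float], weights: List[float]) -> float:
    """Combine group values using the absolute value of each weight.

    Taking abs() keeps the joint value monotonic in every group value,
    which is an assumption of this sketch rather than a detail from the abstract.
    """
    return sum(abs(w) * q for w, q in zip(weights, group_q))


def l1_penalty(selection: List[List[float]], coeff: float) -> float:
    """L1 (sparse) regularizer on the selection weights; pushes weak agent-group links toward zero."""
    return coeff * sum(abs(w) for row in selection for w in row)


# Toy example: four agents, two groups.
agent_q = [1.0, 2.0, 3.0, 4.0]
selection = [
    [0.9, 0.8, 0.0, 0.1],  # weights linking agents 0..3 to group 0
    [0.1, 0.0, 0.7, 0.6],  # weights linking agents 0..3 to group 1
]
groups = assign_groups(selection)
q_groups = group_values(agent_q, groups)
q_total = joint_value(q_groups, weights=[0.5, -2.0])
penalty = l1_penalty(selection, coeff=0.1)
```

In a learned version, the selection weights and mixing weights would be trained together with the agents' value functions, and the penalty would be added to the training loss so that group membership can shift as training proceeds.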

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (e.g., decision and control, planning, hierarchical RL, robotics)

Supplementary Material: zip
