Bi-Level Dynamic Parameter Sharing among Individuals and Teams for Promoting Collaborations in Multi-Agent Reinforcement LearningDownload PDF

Anonymous

22 Sept 2022, 12:35 (modified: 26 Oct 2022, 14:07)ICLR 2023 Conference Blind SubmissionReaders: Everyone
Keywords: Multi-Agent Reinforcement Learning, Reinforcement Learning
TL;DR: We propose a bi-level dynamic parameter sharing mechanism between individuals and teams, which can not only promote agents to learn diversified strategies, but also promote agents to form more stable and complementary cooperative relationships.
Abstract: Parameter sharing has greatly contributed to the success of multi-agent reinforcement learning in recent years. However, most existing parameter sharing mechanisms are static, and parameters are indiscriminately shared among individuals, ignoring the dynamic environments and different roles of multiple agents. In addition, although a single-level selective parameter sharing mechanism can promote the diversity of strategies, it is hard to establish complementary and cooperative relationships between agents. To address these issues, we propose a bi-level dynamic parameter sharing mechanism among individuals and teams for promoting effective collaborations (BDPS). Specifically, at the individual level, we define virtual dynamic roles based on the long-term cumulative advantages of agents and share parameters among agents in the same role. At the team level, we combine agents of different virtual roles and share parameters of agents in the same group. Through the joint efforts of these two levels, we achieve a dynamic balance between the individuality and commonality of agents, enabling agents to learn more complex and complementary collaborative relationships. We evaluate BDPS on a challenging set of StarCraft II micromanagement tasks. The experimental results show that our method outperforms the current state-of-the-art baselines, and we demonstrate the reliability of our proposed structure through ablation experiments.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
4 Replies

Loading