Very Large Scale Multi-Agent Reinforcement Learning with Graph Attention Mean Field

Qianyue Hao

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023, Readers: Everyone

Keywords: Multi-agent reinforcement learning, large-scale problems, graph attention, mean field

TL;DR: A multi-agent reinforcement learning method that solves very large-scale problems by combining a mean-field technique with a graph attention mechanism.

Abstract: With recent advances in reinforcement learning, we have witnessed countless successes of intelligent agents in various domains. In particular, multi-agent reinforcement learning (MARL) is suitable for many real-world scenarios and has vast potential applications. However, typical MARL methods can only handle tens of agents, leaving scenarios with hundreds or even thousands of agents almost unexplored. There are two key challenges in scaling up the number of agents: (1) agent-agent interactions are critical in multi-agent systems, yet the number of interactions grows quadratically with the number of agents, causing great computational complexity and making strategy learning difficult; (2) the strengths of interactions vary among agents and over time, making it difficult to model such interactions precisely. In this paper, we propose the Graph Attention Mean Field (GAT-MF) method, in which we convert agent-agent interactions into interactions between each agent and a weighted mean field, greatly reducing the computational complexity. We mathematically prove the correctness of this conversion. We design a graph attention mechanism to automatically capture the different and time-varying strengths of interactions, ensuring that our method can precisely model interactions among the agents. We conduct extensive experiments in both manual and real-world scenarios with more than 3000 agents, demonstrating that, compared with existing MARL methods, our method reaches superior performance and 9.4 times the computational efficiency.
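
The idea of replacing pairwise agent-agent interactions with a single weighted mean field per agent can be illustrated with a small sketch. This is not the authors' implementation: the function name `weighted_mean_field`, the neighbor-list graph representation, and the scaled dot-product scoring are all assumptions chosen for illustration, standing in for whatever learned attention mechanism the paper actually uses.

```python
import math


def weighted_mean_field(states, neighbors):
    """Return, for each agent, an attention-weighted average of its neighbors' states.

    states:    list of equal-length feature vectors, one per agent
    neighbors: neighbors[i] lists the indices of agents adjacent to agent i

    Hypothetical sketch: scores are scaled dot products of raw states,
    whereas a trained model would learn its scoring function.
    """
    dim = len(states[0])
    fields = []
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            # An isolated agent sees an empty (zero) field.
            fields.append([0.0] * dim)
            continue
        # Unnormalized attention score between agent i and each neighbor j.
        scores = [
            sum(a * b for a, b in zip(states[i], states[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        # Softmax over neighbors (subtracting the max for numerical stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of neighbor states: the agent's mean field.
        field = [
            sum(w * states[j][d] for w, j in zip(weights, nbrs))
            for d in range(dim)
        ]
        fields.append(field)
    return fields


# Three agents on a small graph; agent 2 has no neighbors.
example_states = [[1.0, 0.0], [2.0, 3.0], [2.0, 3.0]]
example_neighbors = [[1, 2], [0], []]
example_fields = weighted_mean_field(example_states, example_neighbors)
```

In a sketch like this, each agent's policy or value function would take its own state plus one aggregated field vector as input, rather than a separate input for every other agent, and the attention weights can change whenever the states change.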

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Supplementary Material: zip

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
