Cooperative Foraging Behaviour Through Multi-Agent Reinforcement Learning with Graph-Based Communication
Keywords: multi-agent, reinforcement learning, graph neural networks, cooperation, sequential social dilemmas, communication
TL;DR: We develop a novel graph-based communication approach for multi-agent reinforcement learning to promote cooperation in sequential social dilemmas.
Abstract: According to the social intelligence hypothesis, cooperation is a key component of intelligence and is required to solve a wide range of problems, from everyday challenges like scheduling meetings to global challenges like mitigating climate change and providing humanitarian aid. Extending the ability of artificial intelligence (AI) to cooperate well is critical as AI becomes more prevalent in our lives. In recent years, multi-agent reinforcement learning (MARL) has emerged as a powerful approach to modelling and analysing cooperation among artificial agents. In this paper, we investigate the impact of communication on cooperation among reinforcement learning agents in social dilemmas. These are settings in which short-term individual interests conflict with long-term collective ones: each individual profits from defecting, but the group as a whole would benefit if everyone cooperated. We focus in particular on a temporally and spatially extended Stag-Hunt-like social dilemma that models animal foraging behaviour using principles from Optimal Foraging Theory. We propose a communication method that combines a graph-based attention mechanism with deep reinforcement learning. We also examine several facets of communication, including the effects of the communication topology and the communication range. We find that graph-based communication enables reinforcement learning agents to achieve greater cooperative behaviour in social dilemmas. We further find that, during foraging, local communication promotes better cooperation than long-distance communication. Finally, we visualise and investigate the learned attention weights and explain how agents process communications from other agents.
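As an illustration of the idea of range-limited, attention-based communication over a graph, the following is a minimal sketch and not the method from the paper. It assumes agents on a 2D plane, links two agents when they lie within a hypothetical `comm_range`, and uses each agent's raw message as both query and key instead of learned projections. All function and variable names are assumptions made for this example.

```python
import math

def communication_graph(positions, comm_range):
    """Boolean adjacency: agents i and j are linked if within comm_range (no self-links)."""
    n = len(positions)
    return [[i != j and math.dist(positions[i], positions[j]) <= comm_range
             for j in range(n)] for i in range(n)]

def attention_weights(messages, adj):
    """Scaled dot-product attention, with the softmax restricted to graph neighbours."""
    n, dim = len(messages), len(messages[0])
    weights = []
    for i in range(n):
        # Score only the neighbours of agent i (assumption: message = query = key).
        scores = {j: sum(a * b for a, b in zip(messages[i], messages[j])) / math.sqrt(dim)
                  for j in range(n) if adj[i][j]}
        if not scores:
            # An isolated agent receives nothing, so all its weights are zero.
            weights.append([0.0] * n)
            continue
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        weights.append([exps.get(j, 0.0) / total for j in range(n)])
    return weights

def aggregate(weights, messages):
    """Attention-weighted sum of neighbour messages for each agent."""
    dim = len(messages[0])
    return [[sum(w[j] * messages[j][k] for j in range(len(messages))) for k in range(dim)]
            for w in weights]

# Four agents; the last one is far outside everyone's communication range.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
messages = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]

adj = communication_graph(positions, comm_range=2.0)
weights = attention_weights(messages, adj)
aggregated = aggregate(weights, messages)
```

Shrinking or enlarging `comm_range` changes which entries of `adj` are set, which is one simple way to contrast local with long-distance communication in a toy setting. The per-agent rows of `weights` are the quantities one would inspect when visualising attention.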