GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning
TL;DR: We study futile neurons in parameter-sharing networks for multi-agent reinforcement learning and propose a new parameter-sharing method that resolves them.
Abstract: Parameter-sharing (PS) techniques have been widely adopted in cooperative Multi-Agent Reinforcement Learning (MARL). In PS, all the agents share a policy network with identical parameters, which enjoys good sample efficiency. However, PS can lead to homogeneous policies that limit MARL performance. We tackle this problem from the angle of gradient conflict among agents. We find that futile neurons, whose updates are canceled out by gradient conflicts among agents, lead to poor learning efficiency and diversity. To address this deficiency, we propose GradPS, a gradient-based PS method. It dynamically creates multiple clones for each futile neuron. For each clone, a group of agents with low gradient conflict shares the neuron's parameters.
Our method achieves good sample efficiency by sharing gradients among the agents assigned to the same clone neuron. Moreover, it encourages diverse behaviors because each group of agents independently updates its own exclusive clone neuron. Through extensive experiments, we show that GradPS can learn diverse policies with promising performance. The source code for GradPS is available at https://github.com/xmu-rl-3dv/GradPS.
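To make the idea of a futile neuron concrete, the following is a minimal illustrative sketch, not the authors' implementation (see the linked repository for that). It assumes per-agent gradients for a single neuron's incoming weights are available as rows of an array. The function names `cancellation_ratio` and `group_by_conflict`, the ratio-based measure of cancellation, and the greedy sign-of-dot-product grouping rule are all hypothetical choices made for illustration; the abstract does not specify how GradPS detects futile neurons or forms groups.

```python
import numpy as np

def cancellation_ratio(agent_grads):
    """Ratio ||sum_i g_i|| / sum_i ||g_i|| for one neuron (hypothetical measure).

    agent_grads: array of shape (n_agents, n_weights), one gradient per agent.
    A value near 0 means the agents' gradients largely cancel in the shared
    update; a value near 1 means they mostly agree.
    """
    g = np.asarray(agent_grads, dtype=float)
    combined = np.linalg.norm(g.sum(axis=0))
    individual = np.linalg.norm(g, axis=1).sum()
    # Assumption: with no gradient at all there is nothing to cancel.
    return combined / individual if individual > 0 else 1.0

def group_by_conflict(agent_grads):
    """Greedy grouping (illustrative only): an agent joins the first group
    whose mean gradient has a non-negative dot product with its own gradient;
    otherwise it starts a new group."""
    g = np.asarray(agent_grads, dtype=float)
    groups = []
    for i, gi in enumerate(g):
        for members in groups:
            if np.dot(gi, g[members].mean(axis=0)) >= 0:
                members.append(i)
                break
        else:
            groups.append([i])
    return groups

# Toy example: agents 0 and 1 pull the neuron one way, agents 2 and 3 the other.
grads = np.array([[1.0, 0.5], [0.9, 0.6], [-1.0, -0.5], [-0.9, -0.6]])
ratio = cancellation_ratio(grads)    # shared update almost fully cancels
groups = group_by_conflict(grads)    # each group would get its own clone
```

In this toy case the shared update nearly vanishes even though every agent has a sizeable gradient, while within each group the gradients agree, which is the situation the abstract describes cloning as resolving.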
Lay Summary: We studied how parameter sharing influences multiple AI agents working together. When all agents use the same network, it might limit their ability to develop different strategies.
We analyzed the issue by examining how network updates (gradients) work. We found that some neurons become "futile" because different agents try to update them in conflicting ways.
Our paper introduces GradPS, a new method that improves performance by fixing these "futile" neurons. This provides a fresh way to optimize other parameter-sharing techniques.
Link To Code: https://github.com/xmu-rl-3dv/GradPS
Primary Area: Reinforcement Learning->Multi-agent
Keywords: multi agent reinforcement learning; parameter sharing; gradient conflict
Submission Number: 10830