GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning

Haoyuan Qin; Zhengzhu Liu; Chenxing Lin; Chennan Ma; Songzhu Mei; Siqi Shen; Cheng Wang

Published: 01 May 2025, Last Modified: 23 Jul 2025, ICML 2025 poster, CC BY 4.0

TL;DR: We study futile neurons in multi-agent reinforcement learning and propose a new parameter-sharing method that resolves them.

Abstract: Parameter-sharing (PS) techniques have been widely adopted in cooperative Multi-Agent Reinforcement Learning (MARL). In PS, all the agents share a policy network with identical parameters, which enjoys good sample efficiency. However, PS could lead to homogeneous policies that limit MARL performance. We tackle this problem from the angle of gradient conflict among agents. We find that the existence of futile neurons, whose updates are canceled out by gradient conflicts among agents, leads to poor learning efficiency and diversity. To address this deficiency, we propose GradPS, a gradient-based PS method. It dynamically creates multiple clones for each futile neuron. For each clone, a group of agents with low gradient conflict shares the neuron's parameters. Our method can enjoy good sample efficiency by sharing the gradients among agents of the same clone neuron. Moreover, it can encourage diverse behaviors through independently updating an exclusive clone neuron. Through extensive experiments, we show that GradPS can learn diverse policies with promising performance. The source code for GradPS is available at https://github.com/xmu-rl-3dv/GradPS.
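To make the idea of gradient cancellation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the exact criterion for a futile neuron or the grouping rule, so everything here is an assumption made for illustration: the cancellation ratio `||sum_i g_i|| / sum_i ||g_i||`, the greedy cosine-based grouping, and the function names `cancellation_ratio` and `group_agents` are all hypothetical. The sketch takes the per-agent gradients of one shared neuron's incoming weights and shows how agents with conflicting gradients could be split into groups, one per clone.

```python
import numpy as np


def cancellation_ratio(agent_grads):
    """Return ||sum_i g_i|| / sum_i ||g_i|| for one neuron.

    agent_grads has shape (n_agents, fan_in). A value near 0 means the
    per-agent gradients largely cancel; a value near 1 means they agree.
    This score is an illustrative assumption, not the paper's definition.
    """
    agent_grads = np.asarray(agent_grads, dtype=float)
    total = np.linalg.norm(agent_grads.sum(axis=0))
    individual = np.linalg.norm(agent_grads, axis=1).sum()
    return float(total / individual) if individual > 0 else 1.0


def group_agents(agent_grads, min_cosine=0.0):
    """Greedily group agents whose gradients point in compatible directions.

    Each agent joins the first group whose summed gradient has cosine
    similarity above min_cosine with the agent's gradient; otherwise it
    starts a new group. Each group would share one clone of the neuron.
    (Hypothetical grouping rule, chosen for simplicity.)
    """
    agent_grads = np.asarray(agent_grads, dtype=float)
    groups, sums = [], []
    for i, g in enumerate(agent_grads):
        for k, s in enumerate(sums):
            denom = np.linalg.norm(g) * np.linalg.norm(s)
            if denom > 0 and np.dot(g, s) / denom > min_cosine:
                groups[k].append(i)
                sums[k] = s + g
                break
        else:
            # No compatible group found: this agent gets its own clone.
            groups.append([i])
            sums.append(g.copy())
    return groups


# Toy example: four agents, two of which push the neuron in the
# opposite direction from the other two.
grads = np.array([[1.0, 0.5], [0.9, 0.6], [-1.0, -0.5], [-0.9, -0.6]])
shared_ratio = cancellation_ratio(grads)   # close to 0: updates cancel
clone_groups = group_agents(grads)         # agents split into two groups
```

In this toy case the fully shared neuron receives almost no net update, while each of the two resulting groups has a cancellation ratio close to 1, which mirrors the abstract's claim that sharing within a low-conflict group preserves the learning signal.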

Lay Summary: We studied how parameter sharing influences multiple AI agents working together. When all agents use the same network, it might limit their ability to develop different strategies. We analyzed the issue by examining how network updates (gradients) work. We found that some neurons become "futile" because different agents try to update them in conflicting ways. Our paper introduces GradPS, a new method that improves performance by fixing these "futile" neurons. This provides a fresh way to optimize other parameter-sharing techniques.

Link To Code: https://github.com/xmu-rl-3dv/GradPS

Primary Area: Reinforcement Learning->Multi-agent

Keywords: multi agent reinforcement learning; parameter sharing; gradient conflict

Submission Number: 10830
