Communication in Multiagent Reinforcement Learning via Counterfactual Message Value
Abstract: Effective communication is pivotal for successful
team collaboration in cooperative multiagent tasks. However,
mainly due to the partially observable nature of the environment,
indiscriminate message requests among teammates may lead to
confusion for individual agents, impeding effective collaboration
and diminishing the overall efficiency of the system. Most
previous work has employed gates or attention mechanisms to
extract relatively important messages. However, these methods
often fail to assess each message’s value explicitly, or they compute
it through an intricate and convoluted process. This may result in
ineffective communication and cause miscoordination in complex
scenarios. To address these challenges, we introduce a novel metric named counterfactual message value (CMV), which quantifies
each message’s contribution to an agent, enabling the effective
elimination of redundant messages. In addition, we present a
practical multiagent reinforcement learning (MARL) algorithm,
termed CMV communication (CMVC), which can effectively
facilitate the learning of agent–agent communication protocols.
It predicts the CMVs of teammates for an agent solely based on its
local observation, enabling the agent to initiate communication
with those exhibiting positive CMVs. To differentiate message
impact levels, we design a message aggregator in CMVC that
aggregates messages based on their individual CMVs. We evaluate CMVC in various tasks, including cooperative navigation,
predator–prey, and the StarCraft multiagent challenge (SMAC).
The results indicate that CMVC is notably effective at reducing
redundant messages and learns more advanced collaboration
strategies than several existing state-of-the-art methods.
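
The gating and aggregation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the CMV formula, so everything here is an assumption made for illustration: the CMV is taken to be the gain in an agent's best action-value when a teammate's message is present rather than replaced by a null message, and the aggregator is taken to be a CMV-weighted average over positively valued messages. The names `counterfactual_message_value`, `select_senders`, `aggregate`, and `toy_q` are hypothetical and do not come from the paper; in CMVC the CMVs are predicted from the local observation rather than computed directly as done below.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
QFunction = Callable[[Sequence[float], Sequence[float]], Sequence[float]]


def counterfactual_message_value(q_fn: QFunction, obs: Sequence[float],
                                 message: Sequence[float],
                                 null_message: Sequence[float]) -> float:
    """Assumed CMV: best action-value with the message minus best
    action-value with the message replaced by a null message."""
    return max(q_fn(obs, message)) - max(q_fn(obs, null_message))


def select_senders(cmvs: Dict[str, float]) -> List[str]:
    """Keep only teammates whose (predicted) CMV is strictly positive."""
    return [name for name, value in cmvs.items() if value > 0.0]


def aggregate(messages: Dict[str, Vector], cmvs: Dict[str, float]) -> Vector:
    """Assumed aggregator: average of positively valued messages,
    each weighted by its share of the total positive CMV."""
    if not messages:
        return []
    dim = len(next(iter(messages.values())))
    out = [0.0] * dim
    senders = select_senders(cmvs)
    if not senders:
        return out  # nothing worth requesting: fall back to a zero message
    total = sum(cmvs[s] for s in senders)
    for s in senders:
        weight = cmvs[s] / total
        for i, x in enumerate(messages[s]):
            out[i] += weight * x
    return out


def toy_q(obs: Sequence[float], msg: Sequence[float]) -> List[float]:
    """Toy action-value function (hypothetical): one value per action."""
    return [o + m for o, m in zip(obs, msg)]


obs = [1.0, 0.0]
null = [0.0, 0.0]
messages = {
    "a": [2.0, 0.0],   # raises the best action-value
    "b": [0.0, 0.5],   # changes nothing that matters
    "c": [-1.0, 0.0],  # lowers the best action-value
    "d": [0.0, 2.0],   # makes a different action best
}
cmvs = {name: counterfactual_message_value(toy_q, obs, m, null)
        for name, m in messages.items()}
print(select_senders(cmvs))
print(aggregate(messages, cmvs))
```

In this toy setting, teammates whose messages leave the best action-value unchanged or lower it are filtered out, and the remaining messages influence the aggregate in proportion to their assumed CMV.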