Keywords: Multi-Agent RL, Learn to Communicate, Common Operating Picture
Abstract: In multi-agent systems, agents often possess only limited observations of the environment. Communication between teammates therefore becomes crucial for enhancing coordination. Past research has primarily focused on encoding local information into embedding messages that are unintelligible to humans. We find that incorporating these messages into agents' policy learning leads to brittle policies when tested on out-of-distribution initial states. We present an approach to multi-agent coordination in which each agent integrates its (history of) observations and the messages it receives into a unified Common Operating Picture (COP), a well-known construct in human teams. This process accounts for the dynamic nature of the environment and the shared mission.
We conducted experiments in the StarCraft II environment to validate our approach. Our results demonstrate the efficacy of COP integration and show that COPs directly lead to robust policies with superior performance compared to state-of-the-art Multi-Agent Reinforcement Learning (MARL) methods. Notably, our approach generalizes better when faced with out-of-distribution initial states.
Submission Number: 28