RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution

Anonymous

08 Oct 2022 (modified: 05 May 2023) · Submitted to Deep RL Workshop 2022
Keywords: Multi-Agent Reinforcement Learning, Evolutionary Algorithm
TL;DR: The first framework to demonstrate that EA can assist MARL in achieving better collaboration on complex cooperative tasks.
Abstract: In fully cooperative tasks, multi-agent credit assignment makes value function approximation difficult, which in turn makes learning collaboration challenging in Multi-Agent Reinforcement Learning (MARL). In contrast, the Evolutionary Algorithm (EA), which requires no value function, has been demonstrated to achieve performance competitive with RL and to further improve RL in single-agent settings. To exploit this potential of EA for improving MARL, we propose a novel learning framework called MARL with \textbf{R}epresentation \textbf{A}symmetry and \textbf{C}ollaborative \textbf{E}volution (RACE). Besides the MARL team, RACE maintains an additional population of collaborative teams. RACE decomposes the policies that control the same member across different teams into nonlinear shared observation representations and individual linear policy representations, i.e., Representation Asymmetry. The shared observation representations, learned collectively by all teams in the population, convey knowledge useful for controlling the corresponding member. On top of the shared representations, each team can be viewed as a composition of different linear policy representations rather than different nonlinear policy networks, which yields a favorable space for collaboration. To achieve effective collaboration, RACE evolves the population with an evolutionary algorithm and thereby provides diverse samples to the MARL team. The MARL team is trained on these diverse samples, and the optimized team is injected back into the population to participate in the evolution. In addition, we design novel \textit{agent-level} crossover and mutation operations that promote both team-level and individual-level exploration. Experiments on the complex continuous control benchmark Multi-Agent MuJoCo and the discrete micromanagement benchmark SMAC show that RACE significantly improves the underlying MARL algorithms. To our knowledge, RACE demonstrates for the first time that EA can assist MARL in achieving better collaboration in complex cooperative tasks.
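
To make the Representation Asymmetry and the agent-level variation operators concrete, the sketch below illustrates one possible reading of the abstract in PyTorch: each agent slot has a shared nonlinear observation encoder used by every team, while a team is represented only by its set of linear policy heads, on which agent-level crossover and mutation operate. This is not the authors' implementation; all class, function, and hyperparameter names (make_team, agent_level_crossover, sigma, etc.) are illustrative assumptions.

# Minimal sketch (assumed structure, not the authors' code) of Representation Asymmetry
# and agent-level crossover/mutation over linear policy heads.
import copy
import torch
import torch.nn as nn

def make_shared_encoders(n_agents, obs_dim, hidden_dim):
    """One nonlinear observation encoder per agent slot, shared by every team in the population."""
    return [nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.Tanh()) for _ in range(n_agents)]

def make_team(n_agents, hidden_dim, act_dim):
    """A team is just a list of linear policy heads, one per agent slot."""
    return [nn.Linear(hidden_dim, act_dim) for _ in range(n_agents)]

def act(shared_encoders, team, observations):
    """Each agent's action comes from its shared encoder followed by its team's linear head."""
    return [torch.tanh(head(enc(obs))) for enc, head, obs in zip(shared_encoders, team, observations)]

def agent_level_crossover(team_a, team_b, agent_idx):
    """Swap one agent slot's linear head between two teams (team-level exploration)."""
    child_a, child_b = copy.deepcopy(team_a), copy.deepcopy(team_b)
    child_a[agent_idx], child_b[agent_idx] = child_b[agent_idx], child_a[agent_idx]
    return child_a, child_b

def agent_level_mutation(team, agent_idx, sigma=0.1):
    """Add Gaussian noise to one agent's linear head (individual-level exploration)."""
    child = copy.deepcopy(team)
    with torch.no_grad():
        for p in child[agent_idx].parameters():
            p.add_(sigma * torch.randn_like(p))
    return child

if __name__ == "__main__":
    n_agents, obs_dim, act_dim, hidden_dim = 3, 10, 4, 64
    encoders = make_shared_encoders(n_agents, obs_dim, hidden_dim)
    population = [make_team(n_agents, hidden_dim, act_dim) for _ in range(5)]
    obs = [torch.randn(1, obs_dim) for _ in range(n_agents)]
    actions = act(encoders, population[0], obs)
    child_a, child_b = agent_level_crossover(population[0], population[1], agent_idx=1)
    mutant = agent_level_mutation(population[2], agent_idx=0)
    print([a.shape for a in actions])

Under these assumptions, crossover and mutation touch only small linear head parameters while the shared nonlinear encoders continue to be trained collectively across the population, which is the property the abstract describes as constructing a favorable space for collaboration.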