Boosting Multi-Agent Reinforcement Learning via Transition-Informed Representations

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: SSL; MARL
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Effective coordination among agents in a multi-agent system requires an understanding of the underlying environment dynamics. However, in multi-agent reinforcement learning (MARL), each agent's partial observability means that, from an ego-centric perspective, the world model fails to account for agent interactions and coordination, which becomes the main obstacle to improving the data efficiency of MARL methods. To address this, motivated by the success of world-model learning in RL and by cognitive science, we devise a world-model-driven learning paradigm that enables agents to build a more holistic representation of their individual observations of the environment. Specifically, we present the Transition-Informed Multi-Agent Representations (TIMAR) framework, which leverages a joint transition model, i.e., a surrogate world model, to learn effective representations across agents through a self-supervised learning objective. TIMAR incorporates an auxiliary module that predicts future transitions from sequential observations and actions, allowing agents to infer the latent state of the system and account for the influence of other agents. Experimental evaluation of TIMAR in various MARL environments demonstrates significantly improved performance and data efficiency compared to strong baselines such as MAPPO, HAPPO, fine-tuned QMIX, MAT, and MA2CL. In addition, we find that TIMAR also improves the robustness and generalization of Transformer-based MARL algorithms such as MAT.
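The abstract does not specify the exact architecture or loss, so the following is only a minimal sketch of how a transition-informed auxiliary objective of the kind described could look: per-agent observations are encoded into latents, a joint transition model predicts the next-step latents of all agents from the current latents and joint actions, and a self-supervised MSE loss against a stop-gradient target encoding is added to the base MARL objective. A single-step variant is shown for brevity; all names (TransitionInformedAux, dimensions, the aux_weight coefficient) are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class TransitionInformedAux(nn.Module):
    """Hypothetical auxiliary module: predicts next-step latent observations
    of all agents from current latents and joint actions (SSL objective)."""

    def __init__(self, obs_dim: int, act_dim: int, n_agents: int, latent_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU(),
                                     nn.Linear(latent_dim, latent_dim))
        # Joint transition model: each agent's next latent depends on every
        # agent's current latent and action (concatenated across agents).
        self.transition = nn.Sequential(
            nn.Linear(n_agents * (latent_dim + act_dim), latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, n_agents * latent_dim))
        self.n_agents, self.latent_dim = n_agents, latent_dim

    def forward(self, obs, actions, next_obs):
        # obs, next_obs: (batch, n_agents, obs_dim); actions: (batch, n_agents, act_dim)
        z = self.encoder(obs)                               # (B, N, D)
        joint = torch.cat([z, actions], dim=-1).flatten(1)  # (B, N*(D+A))
        z_pred = self.transition(joint).view(-1, self.n_agents, self.latent_dim)
        with torch.no_grad():                               # stop-gradient target
            z_target = self.encoder(next_obs)
        return nn.functional.mse_loss(z_pred, z_target)

# Usage (assumed): add the auxiliary loss to the base MARL objective, e.g.
#   aux = TransitionInformedAux(obs_dim=32, act_dim=5, n_agents=3)
#   loss = policy_loss + value_loss + aux_weight * aux(obs, actions, next_obs)
```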
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7799