AgentMixer: Multi-Agent Correlated Policy Factorization

23 Sept 2023 (modified: 25 Mar 2024)
ICLR 2024 Conference Withdrawn Submission
Keywords: Multi-Agent Reinforcement Learning, Correlated Equilibrium
Abstract: Centralized training with decentralized execution (CTDE) is widely used to stabilize partially observable multi-agent reinforcement learning (MARL) by learning a centralized value function. However, existing methods typically assume that agents act independently on their local observations, which rarely yields a correlated joint policy with sufficient coordination. In this paper, we propose AgentMixer, which takes full advantage of CTDE to learn correlated decentralized policies. Specifically, AgentMixer first explicitly models the correlated joint policy through a module named Policy Modifier, which composes the partially observable individual policies conditioned on global state information. Distilling the state-based joint policy into partially observable decentralized policies creates a mismatch caused by the asymmetric information; to overcome it, we introduce Individual-Global-Consistency (IGC), which keeps the modes of the joint and individual policies consistent. Together, these two modules enable learning correlated decentralized policies under partial observability. We further prove theoretically that AgentMixer converges to an $\epsilon$-approximate Correlated Equilibrium. Strong experimental performance on three MARL benchmarks confirms the effectiveness of our method.
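To make two ideas from the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical two-agent, two-action problem with made-up probabilities, and it assumes that "mode consistency" means the most likely joint action equals the tuple of each agent's most likely individual action. All function names (`marginals`, `product_policy`, `joint_mode`, `modes_consistent`) are introduced here for illustration only.

```python
# Hypothetical 2-agent, 2-action example (illustrative numbers, not from the paper).
# correlated_joint[a1][a2] is the probability of the joint action (a1, a2).
# Here the agents always choose matching actions.
correlated_joint = [[0.5, 0.0],
                    [0.0, 0.5]]


def marginals(joint):
    """Return each agent's marginal action distribution."""
    p1 = [sum(row) for row in joint]
    p2 = [sum(col) for col in zip(*joint)]
    return p1, p2


def product_policy(joint):
    """Joint policy obtained when each agent samples independently from its marginal."""
    p1, p2 = marginals(joint)
    return [[a * b for b in p2] for a in p1]


def argmax(probs):
    """Index of the largest probability (first index on ties)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def joint_mode(joint):
    """Most likely joint action (a1, a2)."""
    cells = [(p, (i, j)) for i, row in enumerate(joint) for j, p in enumerate(row)]
    return max(cells, key=lambda cell: cell[0])[1]


def modes_consistent(joint, individual):
    """Assumed reading of mode consistency: the joint policy's mode equals
    the tuple of the individual policies' modes."""
    return joint_mode(joint) == tuple(argmax(p) for p in individual)


# Independent execution with the same marginals loses the correlation:
# half of the probability mass lands on mismatched actions.
independent_joint = product_policy(correlated_joint)
miscoordination = independent_joint[0][1] + independent_joint[1][0]
```

In this toy case both policies have identical marginals, yet the independent version miscoordinates half the time, which is the gap a correlated joint policy is meant to close. The `modes_consistent` check only illustrates the assumed consistency condition; how IGC enforces it during training is not specified in the abstract.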
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 7899