Big Picture Thinking: Enhancing Multi-Agent Imitation Learning through Global Dependencies

20 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Multi-agent reinforcement learning, Generative adversarial imitation learning, Complex dependency, Joint distribution matching, Transformer
Abstract: Multi-agent reinforcement learning (MARL) has emerged as a promising approach for solving complex problems involving multi-agent collaboration or competition. Recently, researchers have turned to imitation learning to avoid the explicit design of intricate reward functions in MARL. By formulating the problem as a distribution-matching task over expert trajectories, imitation learning enables agents to continually approximate expert policies without manual reward engineering. However, classical multi-agent imitation learning frameworks, such as MAGAIL, often match each agent's distribution independently, disregarding the intricate dependencies that arise from agent cooperation. This omission results in inaccurate estimates of action-value functions, weak feedback from the discriminator, and a significant vanishing gradient problem. This paper proposes a novel multi-agent joint distribution matching framework based on the Transformer architecture. It explicitly models global dependencies among agents, sequentially and autoregressively, within both the generator and the discriminator. We also prove theoretically that this framework enhances reward variance and the advantage gradient. Extensive experiments demonstrate substantial performance improvements achieved by our method on various benchmarks.
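To make the abstract's central idea concrete, below is a minimal sketch (not the authors' code, which is not available here) of a Transformer discriminator that scores each agent's state-action pair while attending only to the preceding agents, so that distribution matching is joint and autoregressive rather than per-agent independent. The class name, dimensions, and per-agent token layout are illustrative assumptions.

```python
# Illustrative sketch of a joint, autoregressive GAIL-style discriminator
# for multi-agent imitation learning. Assumes PyTorch; all names and sizes
# are hypothetical, not taken from the paper.
import torch
import torch.nn as nn

class JointDiscriminator(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int,
                 d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        # One token per agent: its observation concatenated with its action.
        self.embed = nn.Linear(obs_dim + act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # per-agent real/fake logit

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_agents, obs_dim); act: (batch, n_agents, act_dim)
        tokens = self.embed(torch.cat([obs, act], dim=-1))
        n = tokens.size(1)
        # Causal mask: agent i's score may depend only on agents 1..i,
        # giving the sequential/autoregressive dependency structure the
        # abstract describes, instead of independent per-agent scoring.
        mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        h = self.encoder(tokens, mask=mask)
        return self.head(h).squeeze(-1)  # (batch, n_agents) logits

# Usage: score a batch of joint state-action samples for 3 agents.
disc = JointDiscriminator(obs_dim=10, act_dim=4)
logits = disc(torch.randn(8, 3, 10), torch.randn(8, 3, 4))
```

Because each agent's logit conditions on the other agents' tokens, the discriminator's feedback reflects joint behavior, which is the mechanism the paper credits for richer reward signal and stronger gradients than independent per-agent matching.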
Supplementary Material: zip
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2525