Towards Complete Expressiveness Capacity of Mixed Multi-Agent Q Value Function

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Multi-Agent, Reinforcement Learning, Value Decomposition
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Value decomposition is an efficient approach to achieving centralized training with decentralized execution in fully cooperative Multi-Agent Reinforcement Learning (MARL) problems. Recently, the Strictly Monotonic Mixing Function (SMMF) has been widely applied in value decomposition methods, but it can suffer from convergence difficulties due to its limited representational capacity. This paper investigates the circumstances under which this representational limitation occurs and presents approaches to overcome it. We begin our investigation with the Linear Mixing Function (LMF), a simple case of the SMMF. First, we prove that the LMF is free of the representational limitation only in a rare class of MARL problems. Second, we propose a two-stage mixing framework, which adds a difference rescaling stage after the SMMF to complete its representational capacity. However, this capacity may remain unrealized because of cross interference between the representations of different action values. Finally, we introduce gradient shaping to address this problem. The experimental results validate our analysis of the LMF's expressiveness and demonstrate the effectiveness of the proposed methods.
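For readers unfamiliar with the terminology, the sketch below illustrates the two concepts the abstract builds on: a Linear Mixing Function (LMF), i.e. a positively weighted sum of per-agent Q-values, which is the simplest strictly monotonic mixing function, and the general shape of a two-stage mixer that applies a rescaling stage after the monotonic mix. This is a minimal illustrative sketch only; the class names (LinearMixer, TwoStageMixer) and the MLP form of the rescaling stage are assumptions for illustration, not the paper's actual architecture.

```python
# Minimal sketch of value-decomposition mixing functions (hypothetical names).
import torch
import torch.nn as nn


class LinearMixer(nn.Module):
    """LMF: Q_tot = sum_i w_i * Q_i + b, with w_i > 0 to keep strict monotonicity."""

    def __init__(self, n_agents: int):
        super().__init__()
        self.log_w = nn.Parameter(torch.zeros(n_agents))  # softplus keeps w_i > 0
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, q_agents: torch.Tensor) -> torch.Tensor:
        # q_agents: (batch, n_agents), the chosen per-agent action values
        w = nn.functional.softplus(self.log_w)
        return q_agents @ w + self.b


class TwoStageMixer(nn.Module):
    """Hypothetical two-stage mixer: an SMMF followed by a rescaling stage.
    The paper's difference rescaling stage is not reproduced here; a small
    MLP stands in to show where such a stage would sit."""

    def __init__(self, n_agents: int, hidden: int = 32):
        super().__init__()
        self.smmf = LinearMixer(n_agents)              # stage 1: monotonic mix
        self.rescale = nn.Sequential(                  # stage 2: rescaling
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, q_agents: torch.Tensor) -> torch.Tensor:
        q_mix = self.smmf(q_agents).unsqueeze(-1)      # (batch, 1)
        return self.rescale(q_mix).squeeze(-1)         # (batch,)


# Usage: mix per-agent Q-values for a batch of transitions.
mixer = TwoStageMixer(n_agents=3)
q = torch.randn(8, 3)      # e.g. Q_i(tau_i, u_i) for 8 transitions
print(mixer(q).shape)      # torch.Size([8])
```

Monotonicity (here enforced via positive weights) is what allows decentralized execution, since each agent's greedy action also maximizes the mixed value; the representational limitation discussed in the abstract is the price paid for that constraint.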
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7786