Keywords: Multi-agent reinforcement learning, Distributed training and execution, Multi-player stochastic game
TL;DR: A collaborative learning approach based on conflict-triggered differential reward interaction is proposed for distributed MARL that eliminates saddle equilibria of stochastic games and achieves enhanced strategy combinations.
Abstract: Multi-agent reinforcement learning (MARL), owing to its potent capabilities in complex systems, has attracted considerable research attention, with collaborative decision-making and control for multi-agent systems being one of its key research focuses. The prevalent learning framework is centralized training with decentralized execution (CTDE), in which decentralized execution provides strategy flexibility and centralized training ensures stationarity and goal consistency, yet centralized training struggles as the scale and complexity of the system grow. To address this issue, we follow the concept of distributed training with decentralized execution (DTDE). Decentralization, however, inherently introduces game-theoretic interactions during learning, which have not been fully studied in related work and consequently constrain the strategy combinations that MARL can attain. In this paper, we devise a novel conflict-triggered differential reward interaction (DRI) approach for distributed evaluation, which enables overall goal consistency through highly efficient local information exchange. With this collaborative learning method, DRI-based MARL (DRIMA) eliminates the notorious issue of converging to saddle equilibria of stochastic games. Moreover, it possesses provable convergence and is compatible with general value-based and policy-based algorithms. Experiments in several benchmark scenarios demonstrate that DRIMA realizes collaborative strategy learning with enhanced global goal achievement.
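For intuition only, the following is a minimal, hypothetical Python sketch of what conflict-triggered reward exchange on top of independent Q-learning might look like. The conflict test, the mixing coefficient, and every name below (Agent, conflict_triggered_exchange, threshold) are assumptions made for illustration; they do not reproduce the paper's actual DRI update rule or its convergence guarantees.

```python
# Hypothetical toy sketch of conflict-triggered differential reward interaction.
# All details below are illustrative assumptions, not the paper's DRI algorithm.
import numpy as np

class Agent:
    def __init__(self, n_states, n_actions, lr=0.1, gamma=0.95):
        self.q = np.zeros((n_states, n_actions))
        self.lr, self.gamma = lr, gamma

    def act(self, s, eps=0.1):
        # epsilon-greedy decentralized execution
        if np.random.rand() < eps:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q[s]))

    def update(self, s, a, r, s_next):
        # standard local Q-learning update on the (possibly adjusted) reward
        td = r + self.gamma * self.q[s_next].max() - self.q[s, a]
        self.q[s, a] += self.lr * td


def conflict_triggered_exchange(agents, local_rewards, threshold=1.0):
    """If an agent's local reward deviates strongly from its neighbors'
    (a proxy 'conflict'), it receives a differential term that pulls its
    reward toward the neighbors' mean, using only local information."""
    r = np.asarray(local_rewards, dtype=float)
    adjusted = r.copy()
    for i in range(len(agents)):
        others = np.delete(r, i)
        if abs(r[i] - others.mean()) > threshold:              # conflict trigger
            adjusted[i] = r[i] + 0.5 * (others.mean() - r[i])  # differential reward term
    return adjusted
```

In this toy version, exchange happens only when local rewards diverge beyond a threshold, which mirrors the event-triggered flavor of the approach while keeping communication local and sparse.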
Supplementary Material: zip
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9171