Towards Cost-Efficient Federated Multi-Agent Reinforcement Learning with Learnable Aggregation

Yi Zhang; Sen Wang; Zhi Chen; Xuwei Xu; Stano Funiak; Frank de Hoog; Jiajun Liu

23 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission

Keywords: Multi-agent Reinforcement Learning; Federated Reinforcement Learning; Multi-agent System

TL;DR: We propose a federated multi-agent reinforcement learning method that employs asynchronous critics and a centralized aggregator to enable coordination and optimize communication efficiency.

Abstract:

Multi-agent reinforcement learning (MARL) often adopts the centralized training with decentralized execution (CTDE) framework to facilitate cooperation among agents. When MARL algorithms are deployed in real-world scenarios, however, CTDE requires gradient transmission and parameter synchronization at every training step, which can incur prohibitive communication overhead. To improve communication efficiency, federated MARL has been proposed, in which gradients are averaged only periodically during communication. However, such straightforward averaging leads to poor coordination and slow convergence arising from the non-i.i.d. problem, as evidenced by our theoretical analysis. To address these two challenges, we propose a federated MARL framework, termed cost-efficient federated multi-agent reinforcement learning with learnable aggregation (FMRL-LA). Specifically, we use asynchronous critics to optimize communication efficiency by filtering out redundant local updates based on estimates of agent utilities. A centralized aggregator rectifies these estimates conditioned on global information, improving cooperation and reducing the non-i.i.d. impact by maximizing the composite system objectives. For a comprehensive evaluation, we re-create a federated multi-agent autonomous driving environment based on MetaDrive. Our findings indicate that FMRL-LA can outperform other baselines by at least 5% on average with respect to system utility.
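The two ideas named in the abstract, periodic averaging of local updates and filtering out low-utility updates before they are sent, can be illustrated with a small sketch. This is not the FMRL-LA algorithm: here the utility scores are supplied as plain numbers, whereas the abstract says they are estimated by asynchronous critics and rectified by a learned aggregator. The function names, the fixed `threshold`, and the fallback to averaging everything are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple


def average_updates(updates: Sequence[Sequence[float]]) -> List[float]:
    """Plain federated averaging: element-wise mean of the local updates."""
    if not updates:
        raise ValueError("need at least one update")
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def filter_by_utility(
    updates: Sequence[Sequence[float]],
    utilities: Sequence[float],
    threshold: float,
) -> List[Sequence[float]]:
    """Keep only updates whose (hypothetical) utility score reaches the threshold."""
    return [u for u, score in zip(updates, utilities) if score >= threshold]


def aggregate_round(
    updates: Sequence[Sequence[float]],
    utilities: Sequence[float],
    threshold: float,
) -> Tuple[List[float], int]:
    """One communication round: filter, then average what remains.

    Returns the aggregated update and the number of updates that passed the filter.
    """
    kept = filter_by_utility(updates, utilities, threshold)
    # Assumption: if the filter rejects every update, average all of them instead.
    return average_updates(kept if kept else updates), len(kept)


if __name__ == "__main__":
    local_updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    utility_scores = [0.9, 0.1, 0.8]  # made-up values, one per agent
    aggregated, sent = aggregate_round(local_updates, utility_scores, threshold=0.5)
    print(aggregated, sent)
```

In this toy run the second agent's update is dropped, so only two of three updates are communicated and averaged. A learned aggregator, as described in the abstract, would replace both the fixed threshold and the uniform mean.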

Supplementary Material: zip

Primary Area: reinforcement learning

Submission Number: 6988
