Efficient Information Sharing for Training Decentralized Multi-Agent World Models

Published: 09 May 2025, Last Modified: 28 May 2025 · RLC 2025 · CC BY 4.0
Keywords: Multi-agent world models, decentralized training, information sharing
Abstract: World models, originally developed for single-agent reinforcement learning, have recently been extended to multi-agent settings. Due to unique challenges in multi-agent reinforcement learning, agents that train their world models independently often learn underperforming policies, so existing work has largely been limited to the centralized training framework, which requires excessive communication. Since communication is the key bottleneck, we ask how agents should communicate efficiently to train, and learn policies from, their decentralized world models. We address this question progressively. First, we allow the agents to communicate with unlimited bandwidth to identify which algorithmic components benefit the most from which types of communication. Then, we restrict inter-agent communication with a predetermined bandwidth limit, challenging the agents to communicate efficiently. Our algorithmic innovation is a scheme that prioritizes the most important information to share while respecting the bandwidth limit. The resulting method yields superior sample efficiency, sometimes even over centralized training baselines, across a range of cooperative multi-agent reinforcement learning benchmarks.
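The abstract describes sharing prioritized information under a fixed bandwidth limit. As a minimal illustrative sketch (not the paper's actual scheme), one could greedily select the highest-priority messages that fit within a per-round byte budget; the `Message` class, the importance scores, and the greedy rule below are all hypothetical assumptions for illustration.

```python
# Hypothetical sketch of bandwidth-limited, priority-based message selection.
# The message format and importance scores are illustrative assumptions,
# not the algorithm proposed in the paper.
from dataclasses import dataclass


@dataclass
class Message:
    payload: bytes      # serialized content, e.g. latent states or model updates
    importance: float   # sender-estimated utility, e.g. world-model prediction error


def select_messages(messages, budget_bytes):
    """Greedily pick the most important messages that fit the bandwidth budget."""
    chosen, used = [], 0
    for msg in sorted(messages, key=lambda m: m.importance, reverse=True):
        size = len(msg.payload)
        if used + size <= budget_bytes:
            chosen.append(msg)
            used += size
    return chosen


# Example: with a 100-byte budget, the two most important messages that fit
# are selected; the 80-byte message is skipped once the budget is exceeded.
msgs = [
    Message(b"a" * 40, importance=0.9),
    Message(b"b" * 80, importance=0.5),
    Message(b"c" * 30, importance=0.7),
]
picked = select_messages(msgs, budget_bytes=100)
```

A real scheme would also need a shared convention for how receivers decode and use the payloads; the greedy knapsack-style rule here is just the simplest way to respect a hard bandwidth cap.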
Submission Number: 103