Keywords: model-based reinforcement learning, multi-agent reinforcement learning
Abstract: World models enable learning policies via latent imagination, offering benefits such as history compression and sample efficiency.
The primary challenge in applying world models to multi-agent tasks is that modeling multi-agent dynamics in latent space requires integrating information from different agents, often creating spurious correlations between their latent states.
Existing methods either reconstruct each agent's observation or rely on communication to maintain these correlations during execution; neither learns the disentangled latent states that are crucial for effective decentralized control.
To address this, we present the Disentangled Multi-Agent World Model (DMAWM), a framework that learns decentralized policies in latent space through a novel architecture comprising independent agent modules and a shared environment module.
During real-environment execution, agent modules independently process local information to form a factorized latent representation.
The environment module is then trained to mirror the factorized structure generated by the agent modules, effectively disentangling individual latent states from the interaction dynamics.
Consequently, imagined rollouts generated by the environment module more faithfully simulate decentralized execution dynamics, facilitating the transfer of policies learned in imagination to decentralized execution.
On three multi-agent reinforcement learning (MARL) benchmarks with both vector and visual observations, DMAWM outperforms existing model-based and model-free approaches in convergence speed and final performance, with additional visualizations demonstrating its efficacy in capturing agent interactions.
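To make the factorized structure concrete, the following is a minimal, hypothetical sketch of how independent agent modules and a shared environment module could be organized; the module names (AgentModule, EnvironmentModule), dimensions, and layer choices are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of a factorized latent world model (not the paper's code).
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM, LATENT_DIM = 3, 16, 4, 32

class AgentModule(nn.Module):
    """Encodes one agent's local observation into its own latent state."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                     nn.Linear(64, LATENT_DIM))
    def forward(self, obs):                  # obs: (batch, OBS_DIM)
        return self.encoder(obs)             # -> (batch, LATENT_DIM)

class EnvironmentModule(nn.Module):
    """Shared dynamics model: predicts each agent's next latent from the
    joint latents and actions while keeping a per-agent factorization."""
    def __init__(self):
        super().__init__()
        joint_dim = N_AGENTS * (LATENT_DIM + ACT_DIM)
        self.core = nn.Sequential(nn.Linear(joint_dim, 128), nn.ReLU())
        # One prediction head per agent, mirroring the factorized structure.
        self.heads = nn.ModuleList(nn.Linear(128, LATENT_DIM)
                                   for _ in range(N_AGENTS))
    def forward(self, latents, actions):     # lists of per-agent tensors
        h = self.core(torch.cat(latents + actions, dim=-1))
        return [head(h) for head in self.heads]

# Usage: encode local observations independently, then imagine one step.
agents = [AgentModule() for _ in range(N_AGENTS)]
env_model = EnvironmentModule()
obs = [torch.randn(8, OBS_DIM) for _ in range(N_AGENTS)]
acts = [torch.randn(8, ACT_DIM) for _ in range(N_AGENTS)]
latents = [agent(o) for agent, o in zip(agents, obs)]    # factorized latents
next_latents = env_model(latents, acts)                  # per-agent predictions
```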
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 22245