Keywords: multi-agent reinforcement learning, cooperative multi-agent, centralized training with decentralized execution, partial observability, inference
Abstract: Partial observability remains a core challenge in cooperative multi-agent reinforcement learning (MARL), often causing poor coordination and suboptimal policies. We show that state-of-the-art methods fail even in simple settings under partial observability. To address this, we propose LIMARL, a latent-inference framework that augments centralized training with decentralized execution (CTDE) via structured latent representations. LIMARL integrates (i) a state representation module that learns compact embeddings of the global state, and (ii) a recurrent inference module that enables agents to recover these embeddings from their local histories. We provide a theoretical analysis of sufficiency and robustness under partial observability. Empirically, LIMARL outperforms strong baselines on diagnostic tasks and on challenging SMAC and SMACv2 scenarios, demonstrating better performance and faster convergence. Our results highlight latent inference as an effective and scalable solution for partially observable MARL. An implementation of LIMARL is available at https://github.com/salmakh1/LIMARL.
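For readers who want a concrete picture of the two modules named in the abstract before consulting the repository, a minimal PyTorch sketch might look as follows. All class names, layer sizes, the GRU/MLP choices, and the auxiliary loss are illustrative assumptions, not the authors' implementation; the actual code is in the linked repository.

```python
# Hypothetical sketch of LIMARL's two modules; names and shapes are
# assumptions, not the official implementation (see the GitHub repo).
import torch
import torch.nn as nn


class StateRepresentationModule(nn.Module):
    """Encodes the global state s into a compact embedding z (used during
    centralized training, where the global state is available)."""

    def __init__(self, state_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.encoder(state)


class RecurrentInferenceModule(nn.Module):
    """Infers the global-state embedding from an agent's local observation
    history, so the embedding remains usable at decentralized execution."""

    def __init__(self, obs_dim: int, latent_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, latent_dim)

    def forward(self, obs_history: torch.Tensor) -> torch.Tensor:
        # obs_history: (batch, time, obs_dim)
        out, _ = self.gru(obs_history)
        # Predict the embedding from the final hidden state.
        return self.head(out[:, -1])


def inference_loss(z_state: torch.Tensor, z_inferred: torch.Tensor) -> torch.Tensor:
    """One plausible training signal: regress the locally inferred embedding
    onto the (stop-gradient) global-state embedding during centralized training."""
    return nn.functional.mse_loss(z_inferred, z_state.detach())
```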
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Fares_Fourati1
Track: Regular Track: unpublished work
Submission Number: 152