Keywords: multi-agent reinforcement learning, cooperative multi-agent, centralized training with decentralized execution, partial observability, inference
Abstract: Partial observability remains a core challenge in cooperative multi-agent reinforcement learning (MARL), often causing poor coordination and suboptimal policies. We show that state-of-the-art methods fail even in simple settings under partial observability. To address this, we propose LIMARL, a latent-inference framework that augments centralized training with decentralized execution (CTDE) via structured latent representations. LIMARL integrates (i) a state representation module that learns compact embeddings of the global state, and (ii) a recurrent inference module that enables agents to recover these embeddings from their local histories. We provide a theoretical analysis of sufficiency and robustness under partial observability. Empirically, LIMARL outperforms strong baselines on diagnostic tasks and on challenging SMAC and SMACv2 scenarios, demonstrating better performance and faster convergence. Our results highlight latent inference as an effective and scalable solution for partially observable MARL. An implementation of LIMARL is available at https://github.com/salmakh1/LIMARL.
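For readers who want a concrete picture of the two modules named in the abstract before consulting the repository, a minimal PyTorch sketch might look as follows. All class names, layer sizes, the GRU/MLP choices, and the auxiliary loss are illustrative assumptions, not the authors' implementation; the actual code is in the linked repository.

```python
# Hypothetical sketch of LIMARL's two modules; names and shapes are
# assumptions, not the official implementation (see the GitHub repo).
import torch
import torch.nn as nn


class StateRepresentationModule(nn.Module):
    """Encodes the global state s into a compact embedding z (used during
    centralized training, where the global state is available)."""

    def __init__(self, state_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.encoder(state)


class RecurrentInferenceModule(nn.Module):
    """Infers the global-state embedding from an agent's local observation
    history, so the embedding remains usable at decentralized execution."""

    def __init__(self, obs_dim: int, latent_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, latent_dim)

    def forward(self, obs_history: torch.Tensor) -> torch.Tensor:
        # obs_history: (batch, time, obs_dim)
        out, _ = self.gru(obs_history)
        # Predict the embedding from the final hidden state.
        return self.head(out[:, -1])


def inference_loss(z_state: torch.Tensor, z_inferred: torch.Tensor) -> torch.Tensor:
    """One plausible training signal: regress the locally inferred embedding
    onto the (stop-gradient) global-state embedding during centralized training."""
    return nn.functional.mse_loss(z_inferred, z_state.detach())
```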
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Fares_Fourati1
Track: Regular Track: unpublished work
Submission Number: 152