Decentralized Asymmetric DQN: Decentralisation without factorisation in Multi Agent Reinforcement Learning

18 Sept 2025 (modified: 12 Feb 2026) | ICLR 2026 Conference Desk Rejected Submission | Readers: Everyone | License: CC BY 4.0
Keywords: multi-agent reinforcement learning, value factorisation methods
Abstract: Cooperative multi-agent reinforcement learning (MARL) poses significant challenges because each agent must simultaneously learn effective representations of its own behavior and that of the other agents in the system. Value factorization methods that use centralized training with decentralized execution (CTDE) offer a way to encode such behaviors within a central network during learning, while relying solely on decentralized policies at deployment. However, existing approaches often have significant limitations: Independent Q-Learning (IQL) suffers from instability, while methods such as VDN and QMIX restrict the solution space through the Individual-Global-Max (IGM) condition. In this work, we introduce Decentralized Asymmetric DQN (Dec-ADQN), a general method that offers decentralization without factorization and therefore does not rely on maintaining the IGM constraint. We evaluate Dec-ADQN on widely used benchmarks, including OvercookedV2 and SMAX, implemented in JaxMARL. Our results show that, despite its broader generality, Dec-ADQN achieves performance on par with, and often surpassing, that of prior methods, demonstrating both robustness and flexibility in cooperative MARL.
Primary Area: reinforcement learning
Submission Number: 10014
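Illustrative note on the IGM condition mentioned in the abstract: IGM requires that the joint action maximising the joint value coincides with the tuple of actions each agent would pick greedily from its own utility. The following is a minimal sketch of that check for a two-agent tabular case, not the paper's method or any part of Dec-ADQN. The function names and all numeric values are hypothetical, and ties between actions are ignored for simplicity.

```python
from itertools import product


def joint_argmax(q_joint):
    """Return the joint action (a1, a2) that maximises a 2-agent joint Q table."""
    n1, n2 = len(q_joint), len(q_joint[0])
    return max(product(range(n1), range(n2)),
               key=lambda a: q_joint[a[0]][a[1]])


def greedy_actions(q1, q2):
    """Each agent independently picks the argmax of its own utility."""
    a1 = max(range(len(q1)), key=lambda i: q1[i])
    a2 = max(range(len(q2)), key=lambda j: q2[j])
    return (a1, a2)


def satisfies_igm(q_joint, q1, q2):
    """True if decentralised greedy actions match the joint greedy action.

    Assumes unique maxima; a full check would compare argmax sets.
    """
    return joint_argmax(q_joint) == greedy_actions(q1, q2)


# Hypothetical per-agent utilities.
q1 = [1.0, 3.0]
q2 = [2.0, 0.5]

# VDN-style additive joint value: Q(a1, a2) = q1[a1] + q2[a2].
q_additive = [[u1 + u2 for u2 in q2] for u1 in q1]

# Hypothetical joint table with a coordination structure that is not additive.
q_coord = [[8.0, -12.0],
           [-12.0, 6.0]]

print(satisfies_igm(q_additive, q1, q2))  # additive table built from q1, q2
print(satisfies_igm(q_coord, q1, q2))     # same utilities, different joint table
```

In this sketch the additive table is consistent with the per-agent utilities by construction, whereas the same utilities do not recover the best joint action of the second table. This only illustrates what the consistency check means; it makes no claim about how any specific algorithm represents or learns these quantities.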