Keywords: Multi-Agent Reinforcement Learning, Exogenous Dec-POMDP, Influence-based Coordination
TL;DR: LEICA: learn exogenous influence, weight counterfactual shaping by sensitivity, and get robust, well-coordinated MARL under exogenous shifts.
Abstract: Multi-agent reinforcement learning (MARL) has advanced the control of many cooperative multi-agent systems. However, most approaches are trained against a single fixed adversarial strategy, leaving teams fragile when that strategy shifts at test time. To address this limitation, we recast cooperative MARL as an Exogenous Dec-POMDP, which separates agent-controllable endogenous dynamics from environment-driven exogenous dynamics so that policies can adapt to exogenous shifts while preserving coordination. Our framework has two main components: (i) learning the exogenous dynamics and (ii) updating the policy toward two complementary goals, namely coordination to achieve high team return and causal influence on future exogenous evolution.
We implement the framework under centralized training with decentralized execution as a practical algorithm, Learning Exogenous Influence for Coordination and Adaptation (LEICA), and evaluate it on SMAX with distinct train and test adversarial strategies. Experimental results show that our approach drastically improves test-time performance against unseen opponent strategies while achieving high training-time performance, demonstrating its ability to handle exogenous shift and improve training stability.
Primary Area: reinforcement learning
Submission Number: 24809