Addressing Exogenous Variability in Cooperative Multi-Agent Reinforcement Learning

Seongmin Kim; Woohyeon Byeon; Seungyul Han; Youngchul Sung

20 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: Multi-Agent Reinforcement Learning, Exogenous Dec-POMDP, Influence-based Coordination

TL;DR: LEICA: learn exogenous influence, weight counterfactual shaping by sensitivity, and get robust, well-coordinated MARL under exogenous shifts.

Abstract: Multi-agent reinforcement learning (MARL) has advanced the control of many cooperative multi-agent systems. However, most approaches are trained against a single fixed adversarial strategy, leaving teams fragile to shifts in adversarial strategy at test time. To address this limitation, we recast cooperative MARL as an Exogenous Dec-POMDP, which separates agent-controllable endogenous dynamics from environment-driven exogenous dynamics, in order to learn policies that adapt to exogenous shifts while preserving coordination. Our framework has two main components: (i) learning the exogenous dynamics and (ii) updating the policy toward two complementary goals, namely coordination to achieve high team return and causal influence on future exogenous evolution. We implement the framework under centralized training with decentralized execution as a practical algorithm, Learning Exogenous Influence for Coordination and Adaptation (LEICA), and evaluate it on SMAX with distinct train and test adversarial strategies. Experimental results show that our approach drastically improves test-time performance against unseen opponent strategies while achieving high training-time performance, demonstrating its ability to handle exogenous shift and improve training stability.

Primary Area: reinforcement learning

Submission Number: 24809
