Abstract: In multi-agent systems, how agents cooperate with each other to jointly confront opponent agents, i.e., the multi-agent cooperative confrontation problem, is an important issue. In recent years, multi-agent reinforcement learning (MARL) has developed rapidly into a general paradigm for multi-agent decision-making, and it has therefore also been applied to cooperative confrontation problems. However, existing works usually focus on cooperation among the controlled agents, which makes it difficult for them to adapt to highly dynamic adversarial environments. To address this issue, in this paper we propose a multi-agent reinforcement learning algorithm embedded with opponent modeling (MARLeOM). The opponent modeling module constructs multi-level opponent models from an environment model and recursive reasoning, and then mixes these models to enhance representational capability. The reinforcement learning part adopts the centralized training with decentralized execution (CTDE) mechanism and the Actor-Critic framework. In the centralized training phase, the Critic has access to the observations and actions of all agents to guide the learning of the Actor; in the decentralized execution phase, the Actor makes decisions based on local information, namely its own observation together with the opponent action predicted by the opponent modeling module from the opponent's observation. The combination of opponent modeling and multi-agent reinforcement learning enables agents to learn optimal cooperative confrontation strategies. Empirical experiments on multiple cooperative adversarial tasks demonstrate that MARLeOM adapts more effectively and achieves better performance than baseline methods.
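To make the interaction between the opponent modeling module and the decentralized Actor concrete, the following is a minimal illustrative sketch. The class and variable names (OpponentModel, Actor, opp_obs, pred_opp_act) and the simple MLP components are assumptions for illustration, not the paper's implementation; the sketch only shows how a predicted opponent action could be concatenated to an agent's local observation before the Actor produces its own action.

```python
# Minimal sketch (hypothetical names, not the paper's implementation):
# an opponent model predicts the opponent's action from the opponent's observation,
# and the decentralized actor conditions on its own observation plus that prediction.
import torch
import torch.nn as nn


class OpponentModel(nn.Module):
    """Predicts an opponent's action distribution from the opponent's observation."""

    def __init__(self, opp_obs_dim: int, opp_act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(opp_obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, opp_act_dim),
        )

    def forward(self, opp_obs: torch.Tensor) -> torch.Tensor:
        # Predicted opponent action probabilities.
        return torch.softmax(self.net(opp_obs), dim=-1)


class Actor(nn.Module):
    """Decentralized actor: acts on local observation plus predicted opponent action."""

    def __init__(self, obs_dim: int, opp_act_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + opp_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor, pred_opp_act: torch.Tensor) -> torch.Tensor:
        # Augment the local observation with the predicted opponent action.
        return torch.softmax(self.net(torch.cat([obs, pred_opp_act], dim=-1)), dim=-1)


# Usage sketch: at execution time each agent augments its local input.
opp_model, actor = OpponentModel(10, 4), Actor(12, 4, 5)
opp_obs, obs = torch.randn(1, 10), torch.randn(1, 12)
action_probs = actor(obs, opp_model(opp_obs))
```

In the CTDE scheme described above, a centralized Critic trained on all agents' observations and actions would then provide the learning signal for such Actors during training, while execution relies only on the local inputs shown here.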