A Survey of Multi-agent Reinforcement Learning from Game Theoretical Perspective

TMLR Paper67 Authors

28 Apr 2022 (modified: 28 Feb 2023)Withdrawn by AuthorsEveryoneRevisionsBibTeX
Abstract: Following the remarkable success of the AlphaGO series, 2019 was a booming year that witnessed significant advances in multi-agent reinforcement learning (MARL) techniques. MARL corresponds to the learning problem in a multi-agent system in which multiple agents learn simultaneously. It is an interdisciplinary domain with a long history that includes game theory, machine learning, stochastic control, psychology, and optimisation. Although MARL has achieved considerable empirical success in solving real-world games, there is a lack of a self-contained overview in the literature that elaborates the game theoretical foundations of modern MARL methods and summarises the recent advances. In fact, the majority of existing surveys are outdated and do not fully cover the recent developments since 2010. In this work, we provide a monograph on MARL that covers both the fundamentals and the latest developments in the research frontier. This survey is separated into two parts. From §1 to §4, we present the self-contained fundamental knowledge of MARL, including problem formulations, basic solutions, and existing challenges. Specifically, we present the MARL formulations through two representative frameworks, namely, stochastic games and extensive-form games, along with different variations of games that can be addressed. The goal of this part is to enable the readers, even those with minimal related background, to grasp the key ideas in MARL research. From §5 to §9, we present an overview of recent developments of MARL algorithms. Starting from new taxonomies for MARL methods, we conduct a survey of previous survey papers. In later sections, we highlight several modern topics in MARL research, including Q-function factorisation, multi-agent soft learning, networked multi-agent MDP, stochastic potential games, zero-sum continuous games, online MDP, turn-based stochastic games, policy space response oracle, approximation methods in general-sum games, and mean-field type learning in games with infinite agents. Within each topic, we select both the most fundamental and cutting-edge algorithms. The goal of our survey is to provide a self-contained assessment of the current state-of-the-art MARL techniques from a game theoretical perspective. We expect this work to serve as a stepping stone for both new researchers who are about to enter this fast-growing domain and existing domain experts who want to obtain a panoramic view and identify new directions based on recent advances.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Matthieu_Geist1
Submission Number: 67
Loading