Probe Into Multi-agent Adversarial Reinforcement Learning through Mean-Field Optimal Control

Ziming Wang; Fengxiang He; Bohan Wang; Die Gan; Dacheng Tao

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023, Readers: Everyone

Abstract: Multi-agent adversarial reinforcement learning (MaARL) has shown promise in solving adversarial games. However, the theoretical tools for analyzing MaARL are still elusive. In this paper, we take the first step toward a theoretical understanding of MaARL through mean-field optimal control. Specifically, we model MaARL as a mean-field quantitative differential game between two dynamical systems with implicit terminal constraints. Based on this game, we study its optimal solution and its generalization. First, a two-sided extremism principle (TSEP) is established as a necessary condition for the optimal solution of the game. We further show that TSEP is also sufficient provided that the terminal time is sufficiently small. Second, based on the TSEP, a generalization bound for MaARL is proposed. The bound does not explicitly rely on the dimensions, norms, or other capacity measures of the model, which are usually prohibitively large in deep learning.

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)
