- Keywords: Multi-agent Reinforcement Learning, Predictive State Representation, Dynamic Interaction Graph
- Abstract: This paper proposes a new algorithm for learning the optimal policies under a novel multi-agent predictive state representation reinforcement learning model. Compared to the state-of-the-art methods, the most striking feature of our approach is the introduction of a dynamic interaction graph to the model, which allows us to represent each agent's predictive state by considering the behaviors of its ``neighborhood'' agents. Methodologically, we develop an online algorithm that simultaneously learns the predictive state representation and agent policies. Theoretically, we provide an upper bound of the $L_2$-norm of the learned predictive state representation. Empirically, to demonstrate the efficacy of the proposed method, we provide thorough numerical results on both a MAMuJoCo robotic learning experiment and a multi-agent particle learning environment.
- One-sentence Summary: We propose a new algorithm for MARL under a multi-agent predictive state representation model, where we incorporate a dynamic interaction graph; we provide the theoretical guarantees of our model and run various experiments to support our algorithm.
- Supplementary Material: zip