A Cooperative-Game-Theoretical Model for Ad Hoc Teamwork

21 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: ad hoc teamwork, reinforcement learning, cooperative game
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Ad hoc teamwork (AHT) is a cutting-edge problem in the multi-agent systems community, in which the task is to control an agent (the 'learner') that must cooperate with new teammates without prior coordination. Prior work formulated AHT as a stochastic Bayesian game (SBG) from the viewpoint of non-cooperative game theory. Follow-up work extended SBG to open team settings and proposed an empirical implementation framework based on graph neural networks (GNNs), called Graph-based Policy Learning (GPL), to handle varying team sizes. Although the performance of GPL is convincing, its global Q-value representation is difficult to interpret, which impedes its application to real-world problems. In this work, we introduce a game model from cooperative game theory called the coalitional affinity game (CAG) and establish a novel theoretical model, the open stochastic Bayesian CAG, to describe the process of AHT with open team settings. Based on this theoretical model, we derive a new solution concept that guides the representation of the global Q-value with theoretical guarantees for this setting. We further design a practical algorithm that implements the theoretical results. In experiments, we demonstrate the performance improvement of the proposed algorithm over GPL and verify the effectiveness and reasonableness of our theoretical model. A demo of the experiments is available at https://sites.google.com/view/cagpl.
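For readers unfamiliar with affinity-style cooperative games, the following is a minimal sketch of the general idea, not the submission's open stochastic Bayesian CAG or its algorithm. It assumes the common textbook form in which an agent's utility in a coalition is the sum of its pairwise affinity weights toward the other members. The function names, agent names, and weights are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a plain affinity-style coalition game.
# Assumption: an agent's utility is the sum of its directed affinity
# weights toward the other members of its coalition. All names and
# numbers here are hypothetical and are not taken from the paper.

def agent_utility(agent, coalition, affinity):
    """Sum of affinity weights from `agent` to every other coalition member."""
    return sum(
        affinity.get((agent, other), 0.0)
        for other in coalition
        if other != agent
    )

def coalition_value(coalition, affinity):
    """Total utility of a coalition: the sum of its members' utilities."""
    return sum(agent_utility(a, coalition, affinity) for a in coalition)

# Directed pairwise affinities (made-up values).
affinity = {
    ("learner", "t1"): 1.0, ("t1", "learner"): 0.5,
    ("learner", "t2"): -0.5, ("t2", "learner"): 0.25,
    ("t1", "t2"): 2.0, ("t2", "t1"): 2.0,
}

team = ["learner", "t1", "t2"]
learner_u = agent_utility("learner", team, affinity)
total = coalition_value(team, affinity)

# Open team setting: teammate t2 leaves, so only the remaining pairs count.
smaller_team = ["learner", "t1"]
total_after_leave = coalition_value(smaller_team, affinity)
```

Because the value is built from pairwise terms, it is defined for any team size, and each term can be read as how much one agent benefits from another's presence. This loosely mirrors the interpretability motivation stated in the abstract, under the assumptions above.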
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4187