One by One, Continual Coordinating with Humans via Hyper-Teammate Identification

Cong Guan; Feng Chen; Ke Xue; Chunpeng Fan; Lichao Zhang; Ziqian Zhang; Pengyao Zhao; Zongzhang Zhang; Chao Qian; Lei Yuan; Yang Yu

Published: 05 Nov 2024, Last Modified: 05 Nov 2024. Accepted by TMLR. License: CC BY 4.0

Abstract: One of the primary objectives of modern artificial intelligence research is to empower agents to coordinate effectively with diverse teammates, particularly human teammates. Previous studies have focused on training agents either with a fixed population of pre-generated teammates or through the co-evolution of distinct populations of agents and teammates. However, it is challenging to enumerate all possible teammates in advance, and it is costly, or even impractical, to maintain a sufficiently diverse population and repeatedly interact with previously encountered teammates. Additional design considerations, such as prioritized sampling, are also required to ensure efficient training. To address these challenges and obtain an efficient human-AI coordination paradigm, we propose a novel approach called Concord. Considering that human participants tend to arrive sequentially, we model training with different teammates as a continual learning problem, akin to how humans learn and adapt in the real world. We propose a mechanism based on hyper-teammate identification to prevent catastrophic forgetting while promoting forward knowledge transfer. Concretely, we introduce a teammate recognition module that captures the identification of the corresponding teammate. Leveraging this identification, a well-coordinated AI policy is generated via a hyper-network. The entire framework is trained in a decomposed policy gradient manner, allowing for effective credit assignment among agents. This approach enables us to train agents with each generated teammate or human one by one, ensuring that agents can coordinate effectively with concurrent teammates without forgetting previous knowledge. Our approach outperforms multiple baselines on various multi-agent benchmarks, with both generated human proxies and real human participants.

Submission Length: Long submission (more than 12 pages of main content)

Video: https://drive.google.com/file/d/1-E8NqrCbkbNNH4BMskn35m7saONWfIPZ/view?usp=sharing

Code: https://github.com/chenf-ai/Concord

Supplementary Material: zip

Assigned Action Editor: ~DJ_Strouse1

Submission Number: 1813
