Keywords: coordination, multi-agent reinforcement learning, common knowledge, misspecification
Abstract: Zero-shot coordination (ZSC) is a popular setting for studying the ability of AI agents to coordinate with novel partners. Prior formulations of ZSC assume that the problem setting is common knowledge, i.e., each agent knows the underlying Dec-POMDP, every agent knows that the others have this knowledge, and so on ad infinitum. However, in most real-world situations, different agents are likely to have different models of the (real-world) environment, which breaks this assumption. To address this limitation, we formulate the _noisy zero-shot coordination_ (NZSC) problem, where agents observe different noisy versions of the ground truth Dec-POMDP, generated by passing the true Dec-POMDP through a noise model. Only the distribution of ground truth Dec-POMDPs and the noise model are common knowledge. We show that any NZSC problem can be reformulated as a ZSC problem by designing a meta-Dec-POMDP with an augmented state space consisting of both the ground truth Dec-POMDP and its corresponding state. In our experiments, we analyze various aspects of NZSC and show that achieving good performance in NZSC requires agents to make use of their noisy observations of the ground truth Dec-POMDP, knowledge of each other's noise models, and their interactions with the ground truth Dec-POMDP. Our experimental results further establish that ignoring the noise in the problem specification can result in subpar ZSC performance, especially in iterated scenarios. On the whole, our work highlights that NZSC adds an orthogonal challenge to traditional ZSC: tackling uncertainty about the true problem.
Submission Number: 49