Track: Full track
Keywords: zero-shot coordination, procedural environment generation, multi-agent interactions
TL;DR: Learning to coordinate with a single partner across multiple problems helps agents coordinate with novel partners on novel problems
Abstract: Zero-shot coordination (ZSC) is an important challenge for developing adaptable AI systems capable of collaborating with humans on unfamiliar tasks. While prior work has mainly focused on adapting to new partners, generalizing cooperation across different environments is equally important. This paper investigates training AI agents in self-play (SP) to achieve zero-shot collaboration with novel partners on novel tasks. We introduce Infinite Kitchen, a new JAX-based, procedurally generated environment for multi-agent reinforcement learning. Our rule-based generator creates billions of solvable kitchen configurations, enabling the training of a single, generalizable agent that can adapt to new levels. Our results show that exposure to diverse levels in self-play consistently improves generalization to new partners, with graph neural network (GNN)-based architectures achieving the highest performance across many layouts. Our findings suggest that learning to collaborate across a multitude of unique scenarios encourages agents to develop maximally general norms, which prove highly effective for collaboration with different partners when combined with appropriate inductive biases.
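The abstract describes a rule-based generator paired with a solvability guarantee. As illustration only, here is a minimal sketch of what such a generator with rejection sampling might look like; all names (`generate_level`, `is_solvable`, the tile constants) and the layout rules are assumptions for this sketch, not the paper's actual Infinite Kitchen implementation:

```python
# Hypothetical sketch: sample a walled kitchen grid with JAX's PRNG, then
# rejection-sample until a simple reachability check deems it solvable.
import jax
import jax.numpy as jnp
from collections import deque

FLOOR, WALL, POT, PLATE_PILE, ONION_PILE, DELIVERY = 0, 1, 2, 3, 4, 5

def generate_level(key, height=7, width=9, wall_prob=0.15):
    """Sample a bordered kitchen grid and scatter task-relevant stations."""
    k_walls, k_items = jax.random.split(key)
    interior = jax.random.bernoulli(k_walls, wall_prob, (height, width))
    grid = jnp.where(interior, WALL, FLOOR)
    grid = grid.at[0, :].set(WALL).at[-1, :].set(WALL)
    grid = grid.at[:, 0].set(WALL).at[:, -1].set(WALL)
    # Place one of each station on distinct interior cells.
    cells = jnp.stack(jnp.meshgrid(jnp.arange(1, height - 1),
                                   jnp.arange(1, width - 1),
                                   indexing="ij"), -1).reshape(-1, 2)
    picks = jax.random.choice(k_items, cells.shape[0], (4,), replace=False)
    for tile, idx in zip((POT, PLATE_PILE, ONION_PILE, DELIVERY), picks):
        r, c = cells[idx]
        grid = grid.at[r, c].set(tile)
    return grid

def is_solvable(grid):
    """Flood-fill the walkable floor; call the level solvable (in this toy
    sense) if every station borders that connected region."""
    g = jnp.asarray(grid).tolist()  # plain Python for the BFS
    h, w = len(g), len(g[0])
    start = next(((r, c) for r in range(h) for c in range(w)
                  if g[r][c] == FLOOR), None)
    if start is None:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (nr, nc) not in seen and 0 <= nr < h and 0 <= nc < w \
                    and g[nr][nc] == FLOOR:
                seen.add((nr, nc))
                queue.append((nr, nc))
    def reachable(tile):
        return any(abs(r - tr) + abs(c - tc) == 1
                   for r, c in seen
                   for tr in range(h) for tc in range(w)
                   if g[tr][tc] == tile)
    return all(reachable(t)
               for t in (POT, PLATE_PILE, ONION_PILE, DELIVERY))

# Rejection-sample until a solvable layout appears.
key = jax.random.PRNGKey(0)
while True:
    key, sub = jax.random.split(key)
    level = generate_level(sub)
    if is_solvable(level):
        break
```

Because generation is driven by a splittable PRNG key, each key deterministically indexes one layout, which is one plausible way a generator of this kind could enumerate billions of distinct solvable configurations.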
Submission Number: 43