Keywords: multi-agent reasoning, LLM benchmark, zero-shot coordination, theory of mind
Abstract: We study zero-shot coordination (ZSC), where independently trained agents must cooperate with unseen partners at test time. While ZSC is well studied for RL agents, little is known about how LLM-based agents perform in this setting, despite their growing real-world use. Prior work on LLM coordination relies on specialised scaffolding that is unlikely to be adopted by independent agents and tends to overfit to narrow tasks (e.g., Overcooked) that are too complex to isolate fundamental coordination abilities. In contrast, we study simple, generic scaffolds in minimal environments from the zero-shot coordination literature. Our experiments show that even in simplified settings, frontier LLM agents fail to coordinate due to a limited understanding of the underlying coordination challenge and poor reasoning about their partner's behaviour. While LLMs enable the use of _semantic information_ for coordination, we find that agents fail to leverage it effectively. Finally, we propose _Coordination Friendly Definitions (CFDs)_ as a principled way to enable robust coordination among LLM agents.
Submission Number: 38