Theory of Mind Guided Strategy Adaptation for Zero-Shot Coordination

Published: 19 Dec 2025, Last Modified: 05 Jan 2026, AAMAS 2026 Full, CC BY 4.0
Keywords: Zero Shot Coordination, Multiagent RL, Theory of Mind
TL;DR: We propose to cooperate with unknown teammates by using theory of mind to infer the teammate's intentions and then selecting, from a policy library, the specialized best-response agent best suited to those inferred intentions.
Abstract: A central challenge in multi-agent reinforcement learning is training agents that can adapt to previously unseen teammates in a zero-shot fashion. Prior work in this zero-shot coordination setting often follows a two-stage process: first generating a diverse training pool of partner agents, and then training a best-response ego agent to collaborate effectively with the entire training pool. While many previous works have achieved strong performance by devising better ways to diversify the partner agent pool, there has been less emphasis on how to use this training pool to construct an adaptive agent. One limitation is that the best-response agent may converge to a $\textit{static, generalist}$ policy that performs reasonably well across diverse teammates, rather than learning a more $\textit{adaptive, specialist}$ policy that can better adapt to teammates and achieve higher synergy. To address this, we propose an adaptive ensemble agent that uses Theory-of-Mind-based best-response selection to first infer the teammate's intentions and then choose the most suitable policy from the ensemble to cooperate with those intentions. We conduct experiments in the Overcooked environment to evaluate zero-shot coordination performance under both fully and partially observable settings. Empirically, our method outperforms a single best-response baseline.
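The infer-then-select loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes, purely for illustration, that teammate intentions form a small discrete set of types, that the belief over types is updated with Bayes' rule from per-type action likelihoods, and that a payoff table estimating each specialist's return with each type is available. The names `update_belief`, `select_policy`, and `payoff` are hypothetical.

```python
def update_belief(belief, likelihoods):
    """Bayes update over hypothetical teammate types.

    belief[k]      : prior probability that the teammate is of type k
    likelihoods[k] : probability of the observed teammate action under type k
    """
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        # The observation is impossible under every hypothesis; keep the prior.
        return list(belief)
    return [u / total for u in unnormalized]


def select_policy(belief, payoff):
    """Return the index of the specialist with the highest expected return.

    payoff[i][k] : assumed estimate of the return when specialist i
                   is paired with a teammate of type k
    """
    expected = [sum(b * r for b, r in zip(belief, row)) for row in payoff]
    return max(range(len(payoff)), key=expected.__getitem__)


# Illustrative numbers only: two teammate types, two specialists.
payoff = [
    [10.0, 2.0],  # specialist 0 pairs well with type 0
    [3.0, 9.0],   # specialist 1 pairs well with type 1
]
belief = [0.5, 0.5]
# The observed action is four times more likely under type 1 than type 0.
belief = update_belief(belief, [0.2, 0.8])
chosen = select_policy(belief, payoff)
```

In this toy setup the belief shifts toward type 1 after one observation, so the selector switches to the specialist that pairs well with that type. A single generalist policy would correspond to using the same row of `payoff` regardless of the belief.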
Area: Coordination, Organisations, Institutions, Norms and Ethics (COINE)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 935