DiZCo: Planning Zero-Shot Coordination in World Models

ICLR 2026 Conference Submission 21052 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: human-AI cooperation, generative modeling, multi-agent planning
TL;DR: We introduce a framework that addresses human-AI collaboration by combining environment and partner modeling with test-time search.
Abstract: Developing intelligent agents that can seamlessly cooperate and coordinate with other agents in shared environments, including humans, has become a critical research challenge in AI. This requires agents to understand environment dynamics and anticipate other agents' responses to each action. Current research approaches human-AI coordination through model-free policies for optimality and population-based training for robustness. However, these approaches are brittle and can fail when collaborating with people, because human behavior is diverse and unpredictable and cannot be comprehensively captured by the training distribution. Striving for a solution that balances robustness and optimality, we introduce **DiZCo**, the first framework that leverages generative models to enable real-time, search-based planning in a complex human-AI cooperative task. We first train a generative model, which serves as our world model, to predict future world trajectories conditioned on the current state, ego actions, and a partner strategy inferred from the partner's identity. We then train a generative action proposer that proposes plausible ego action candidates based on the world state. At test time, we pass every proposed action candidate through the world model and search over the predicted outcomes to identify the optimal future trajectory. Offline evaluations indicate that the DiZCo framework outperforms state-of-the-art model-free policies in joint reward. To validate that the method is feasible for real-time human interaction, we engineer a system in which model-based planning and search run fast enough to cooperate live with humans. A preliminary user study yielded positive feedback, supporting the framework's practical effectiveness for real-time human-AI collaboration.
Primary Area: generative models
Submission Number: 21052
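
The propose-then-search loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the generative world model and action proposer are replaced by hand-written toy stand-ins, and every name here (`rollout`, `toy_proposer`, `joint_reward`, `plan`), the 1-D state, and the reward are assumptions made purely for illustration.

```python
import random
from typing import Callable, List, Tuple

State = Tuple[int, int]  # hypothetical toy state: (ego position, partner position)
Action = int             # hypothetical toy action: move by -1, 0, or +1 each step


def rollout(state: State, ego_action: Action, partner_id: int, horizon: int) -> List[State]:
    """Toy stand-in for a world model: predicts a future trajectory from the
    current state, an ego action, and a partner strategy keyed by identity."""
    ego, partner = state
    trajectory = []
    for _ in range(horizon):
        ego += ego_action
        if partner_id == 1:
            # Assumed strategy: partner 1 steps toward the ego agent.
            partner += (ego > partner) - (ego < partner)
        # Assumed strategy: any other partner stays where it is.
        trajectory.append((ego, partner))
    return trajectory


def toy_proposer(state: State, k: int = 8, seed: int = 0) -> List[Action]:
    """Toy stand-in for an action proposer: samples k candidate ego actions."""
    rng = random.Random(seed)
    return [rng.choice((-1, 0, 1)) for _ in range(k)]


def joint_reward(trajectory: List[State]) -> float:
    """Assumed toy objective: the agents should end up at the same position."""
    ego, partner = trajectory[-1]
    return -abs(ego - partner)


def plan(state: State, partner_id: int,
         proposer: Callable[[State], List[Action]] = toy_proposer,
         horizon: int = 3) -> Action:
    """Test-time search: roll every proposed candidate through the world model
    and return the candidate whose predicted trajectory scores highest."""
    candidates = proposer(state)
    return max(candidates,
               key=lambda a: joint_reward(rollout(state, a, partner_id, horizon)))
```

With a stationary partner, `plan((0, 3), partner_id=0, proposer=lambda s: [-1, 0, 1])` returns `1`, since moving toward the partner for three steps closes the gap. In a real system the search quality depends on the proposer covering good actions, which is why this sketch takes the proposer as a parameter.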