Keywords: multi-agent, LLM agents, collaboration, grounded interaction, embodied
TL;DR: We introduce MINDcraft, a popular open-source framework (4.3k GitHub stars) for playing Minecraft, and MineCollab, a benchmark for testing human-AI collaboration in Minecraft.
Abstract: Collaboration lies at the heart of human intelligence—whether brainstorming ideas, dividing responsibilities, or planning complex tasks together.
Can large language models (LLMs) do the same?
We introduce MINDcraft, a dynamic platform that pushes the limits of AI collaboration by combining real-time, adaptive communication with 47 powerful in-game tools that let agents act in the rich, open world of Minecraft.
Alongside it, we present MineCollab, a benchmark for evaluating how well agents coordinate, plan, and execute tasks together.
Our experiments reveal a striking result: LLM agents falter when collaboration demands clear and detailed communication, showing up to a 15% performance drop when they must articulate step-by-step plans.
These findings highlight that while today’s agents can act, true collaboration still hinges on mastering language as a medium for shared understanding and joint reasoning.
Video demonstrations illustrating the capabilities and failure modes of our agents can be found here: \url{https://mindcraft-minecollab.github.io/index.html}
Submission Type: Research Paper (4-9 Pages)
Submission Number: 80