Abstract: Collaboration is ubiquitous and essential in day-to-day life—from exchanging
ideas, to delegating tasks, to generating plans together. This work studies how
LLMs can adaptively collaborate to perform complex embodied reasoning tasks.
To this end, we introduce MINDcraft, an easily extensible platform that enables LLM agents to control characters in the open-world game of Minecraft, and
MineCollab, a benchmark to test the different dimensions of embodied and collaborative reasoning. An experimental study finds that the primary bottleneck to effective collaboration for current state-of-the-art agents is efficient natural language communication, with agent performance dropping by as much as 15% when
they are required to communicate detailed task completion plans. We conclude that
existing LLM agents are ill-optimized for multi-agent collaboration, especially in
embodied scenarios, and highlight the need to employ methods beyond in-context
and imitation learning.