Abstract: Large language models (LLMs) have demonstrated remarkable performance across various tasks. Their potential to help humans coordinate many agents is promising but largely underexplored. Such capabilities would be helpful in disaster response, urban planning, and real-time strategy scenarios. In this work, we introduce, first, a real-time strategy game benchmark designed to evaluate these abilities and, second, a novel framework we term hybrid intelligence for vast engagements (HIVE). HIVE empowers a single human to coordinate swarms of up to 2000 agents through a natural language dialog with an LLM. We present promising results on this multiagent benchmark, with our hybrid approach solving tasks such as coordinating agent movements, exploiting unit weaknesses, leveraging human annotations, and understanding terrain and strategic points. Our findings also highlight critical limitations of current models, including difficulties in processing spatial–visual information and in formulating long-term strategic plans. This work sheds light on the potential and limitations of LLMs in human-swarm coordination, paving the way for future research in this area. The HIVE project page, hive.syrkis.com, includes videos of the system in action.
External IDs: doi:10.1109/tg.2025.3564042