Automatic agent chaining for multimodal task support

Published: 24 Sept 2025, Last Modified: 24 Sept 2025, NeurIPS 2025 LLM Evaluation Workshop Poster, CC BY 4.0
Keywords: Agentic AI, Planning, Task Guidance System
TL;DR: An agentic system and multimodal dataset for task assistance
Abstract: The future of human-computer interaction is moving toward systems where Large Language Models (LLMs) act as autonomous agents, capable of self-planning and adapting to complex, domain-specific tasks. However, a significant gap remains in developing agentic architectures that integrate seamlessly into real-world, multimodal task support systems. We present our initial work on a novel agentic architecture for process task guidance, designed to assist human technicians in complex physical tasks. Our system performs automatic agent chaining via a dynamic planner that recruits specialized agents to solve sub-tasks. To evaluate this approach, we collected a novel multimodal dataset of human-agent interactions during a toy assembly task and benchmarked our agentic system against a non-agentic baseline. Our findings show that the agentic solution significantly improves response quality and reduces incorrect outputs.
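
The abstract does not give implementation details, but the following is a minimal sketch of what automatic agent chaining via a dynamic planner could look like. All class, function, and agent names (DynamicPlanner, AgentResult, "vision", "instruction") are hypothetical illustrations under assumed behavior, not the authors' implementation.

```python
# Hypothetical sketch of agent chaining via a dynamic planner.
# Names and routing logic are illustrative assumptions only.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentResult:
    """Output of one specialized agent for one sub-task."""
    agent_name: str
    output: str


# A specialized agent is modeled as a function of (query, prior results) -> text.
AgentFn = Callable[[str, List[AgentResult]], str]


@dataclass
class DynamicPlanner:
    """Recruits specialized agents and chains them for a user query."""
    agents: Dict[str, AgentFn] = field(default_factory=dict)

    def register(self, name: str, fn: AgentFn) -> None:
        self.agents[name] = fn

    def plan(self, query: str) -> List[str]:
        # Placeholder keyword routing; a real planner would likely use an
        # LLM to decide which agents to recruit and in what order.
        names: List[str] = []
        if "image" in query or "assembly" in query:
            names.append("vision")
        names.append("instruction")
        return names

    def run(self, query: str) -> List[AgentResult]:
        # Chain agents: each agent sees the outputs of earlier agents.
        results: List[AgentResult] = []
        for name in self.plan(query):
            output = self.agents[name](query, results)
            results.append(AgentResult(name, output))
        return results


if __name__ == "__main__":
    planner = DynamicPlanner()
    planner.register("vision", lambda q, ctx: "detected: toy base plate, 4 screws")
    planner.register(
        "instruction",
        lambda q, ctx: f"next step given {ctx[-1].output if ctx else q}",
    )
    for r in planner.run("image of current toy assembly state"):
        print(r.agent_name, "->", r.output)
```

In this sketch the chaining is "automatic" in the sense that the planner, not the user, decides which agents run and in what order, and each recruited agent receives the accumulated outputs of the agents before it.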
Submission Number: 75