Keywords: embodied ai, language models, prompt engineering, symbolic planning, pddl
Abstract: We study how far carefully designed language interfaces can take large language models (LLMs) on the Embodied Agent Interface (EAI) benchmark, which decomposes embodied decision-making into goal interpretation, subgoal decomposition, action sequencing, and transition modeling in the BEHAVIOR and VirtualHome simulators. Rather than training new models, we keep the LLMs fixed and redesign the prompts: we make predicate vocabularies and argument conventions explicit, enforce JSON output schemas, and provide few-shot examples. For action sequencing, we additionally introduce a PDDL-based scaffold in which the LLM corrects approximate domain and problem files before either emitting a plan directly or delegating search to the Fast Downward planner. Our results highlight interface design as a critical degree of freedom for embodied LLMs, even under fixed model weights.
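The action-sequencing scaffold described in the abstract can be pictured roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: `llm` is a hypothetical prompt-to-completion callable, and the prompts and file names are illustrative. Only the Fast Downward invocation (the `fast-downward.py` driver with a `--search` configuration, plan written to `sas_plan`) reflects the planner's actual command-line interface.

```python
import subprocess

def plan_with_pddl_scaffold(llm, domain_draft: str, problem_draft: str,
                            delegate: bool = True) -> str:
    """Sketch of the PDDL scaffold: the LLM repairs approximate PDDL
    files, then either plans directly or hands off to Fast Downward.
    `llm` is an assumed callable mapping a prompt string to a completion.
    """
    # Step 1 (illustrative prompts): ask the LLM to correct the
    # approximate domain and problem files.
    domain = llm(f"Fix any errors in this PDDL domain file:\n{domain_draft}")
    problem = llm("Fix any errors in this PDDL problem file, consistent "
                  f"with the domain:\n{domain}\n{problem_draft}")

    if not delegate:
        # Variant A: the LLM emits the plan directly from the corrected files.
        return llm(f"Produce a plan for:\n{domain}\n{problem}")

    # Variant B: delegate search to the Fast Downward planner.
    with open("domain.pddl", "w") as f:
        f.write(domain)
    with open("problem.pddl", "w") as f:
        f.write(problem)
    subprocess.run(
        ["./fast-downward.py", "domain.pddl", "problem.pddl",
         "--search", "astar(lmcut())"],  # a standard Fast Downward call
        check=True,
    )
    # Fast Downward writes the resulting plan to `sas_plan` by default.
    with open("sas_plan") as f:
        return f.read()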
Submission Number: 5