Keywords: embodied ai, language models, prompt engineering, symbolic planning, pddl
Abstract: We study how far carefully designed language interfaces can take large language models (LLMs) on the Embodied Agent Interface (EAI) benchmark, which decomposes embodied decision-making into goal interpretation, subgoal decomposition, action sequencing, and transition modeling in the BEHAVIOR and VirtualHome simulators. Rather than training new models, we keep the LLMs fixed and redesign the prompts: we make predicate vocabularies and argument conventions explicit, enforce JSON output schemas, and provide few-shot examples. For action sequencing, we additionally introduce a PDDL-based scaffold in which the LLM corrects approximate domain and problem files before either emitting a plan directly or delegating search to the Fast Downward planner. Our results highlight interface design as a critical degree of freedom for embodied LLMs, even under fixed model weights.
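The action-sequencing scaffold described in the abstract can be pictured roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: `llm` is a hypothetical prompt-to-completion callable, and the prompts and file names are illustrative. Only the Fast Downward invocation (the `fast-downward.py` driver with a `--search` configuration, plan written to `sas_plan`) reflects the planner's actual command-line interface.

```python
import subprocess

def plan_with_pddl_scaffold(llm, domain_draft: str, problem_draft: str,
                            delegate: bool = True) -> str:
    """Sketch of the PDDL scaffold: the LLM repairs approximate PDDL
    files, then either plans directly or hands off to Fast Downward.
    `llm` is an assumed callable mapping a prompt string to a completion.
    """
    # Step 1 (illustrative prompts): ask the LLM to correct the
    # approximate domain and problem files.
    domain = llm(f"Fix any errors in this PDDL domain file:\n{domain_draft}")
    problem = llm("Fix any errors in this PDDL problem file, consistent "
                  f"with the domain:\n{domain}\n{problem_draft}")

    if not delegate:
        # Variant A: the LLM emits the plan directly from the corrected files.
        return llm(f"Produce a plan for:\n{domain}\n{problem}")

    # Variant B: delegate search to the Fast Downward planner.
    with open("domain.pddl", "w") as f:
        f.write(domain)
    with open("problem.pddl", "w") as f:
        f.write(problem)
    subprocess.run(
        ["./fast-downward.py", "domain.pddl", "problem.pddl",
         "--search", "astar(lmcut())"],  # a standard Fast Downward call
        check=True,
    )
    # Fast Downward writes the resulting plan to `sas_plan` by default.
    with open("sas_plan") as f:
        return f.read()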
Submission Number: 5