Aligning LLM-Generated Tasks with Physical Executability in Grounded Environments

ACL ARR 2026 January Submission 5134 Authors

05 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Constraint-Aware Generation, Execution-Grounded Constraints
Abstract: Large language models (LLMs) are increasingly used to generate task instructions for grounded agents, yet linguistic fluency does not guarantee physical or operational feasibility. We reveal an executability gap: LLMs often produce instructions that sound plausible but violate environment constraints (e.g., nonexistent entities or invalid preconditions). Mechanistically, we observe a layer-wise preference shift where early representations favor grounded candidates, while deeper layers increasingly promote linguistically coherent but less constrained continuations. We then introduce a constraint-aware evaluation protocol with execution-verifiable constraints, and propose a training-free generation-time intervention that injects execution-aware constraints without retraining the downstream agent. Across code, tool-use, and embodied benchmarks, this simple adjustment consistently improves executability and task success (e.g., VirtualHome executability 41.3→46.0, correctness 42.6→47.0). Our results suggest that aligning generation with environment-verifiable constraints is a key bottleneck for grounded task generation.
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: Constraint-Aware Generation, Execution-Grounded Constraints
Languages Studied: English
Submission Number: 5134