Keywords: Large Language Models, Embodied Agents, Task Planning, Reinforcement Learning
TL;DR: We solve "Signal Confounding" in LLM planners with DDCG, a framework using decoupled Feasibility and Quality critics to provide clear guidance and improve success rates in embodied tasks.
Abstract: Large Language Models (LLMs) have endowed embodied agents with unprecedented high-level planning capabilities. However, grounding abstract language plans in the physical world remains a significant challenge. Although feedback-based closed-loop systems are the dominant paradigm, we identify a critical bottleneck: Signal Confounding. Current feedback mechanisms fail to distinguish between physically infeasible actions, which violate physical rules, and strategically sub-optimal choices. This ambiguity severely hinders effective plan correction. To address it, we propose Decoupled Dual-Critic Guidance (DDCG), a framework that guides planning with two independent and explicit feedback signals. DDCG employs two critics: a Feasibility Critic ($C_F$), which judges whether an action is physically compliant, and a Quality Critic ($C_Q$), which evaluates the strategic value of an action conditioned on its feasibility. This decoupled guidance enables the LLM planner to perform precise error attribution, leading to better decision-making. Theoretically, DDCG can be viewed as a form of guided planning under dual-critic constraints: the Feasibility Critic defines a hard safety boundary, while the Quality Critic provides a reward gradient for guidance within it. This allows for more effective planning without requiring expensive parameter updates. Extensive experiments on the embodied benchmark VirtualHome demonstrate that DDCG significantly improves both task success rate and plan executability, establishing a more robust paradigm for the LLM grounding problem.
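To make the decision rule concrete, here is a minimal sketch of the dual-critic selection the abstract describes. The interfaces `feasibility_critic` and `quality_critic` are hypothetical names of our own choosing, not the paper's API: $C_F$ acts as a hard gate defining the feasible set, and $C_Q$ ranks actions within it, so infeasibility and low quality surface as separate signals.

```python
from typing import Callable, List, Optional

def ddcg_select_action(
    candidates: List[str],
    feasibility_critic: Callable[[str], bool],   # C_F: hard physical-compliance gate
    quality_critic: Callable[[str], float],      # C_Q: strategic value, scored only on feasible actions
) -> Optional[str]:
    """Dual-critic guided selection: C_F defines the feasible set,
    C_Q provides the ranking within it. Returning None (no feasible
    action) is a distinct signal from choosing a low-quality one,
    which is what lets the planner attribute errors precisely."""
    feasible = [a for a in candidates if feasibility_critic(a)]
    if not feasible:
        return None  # infeasibility reported separately, prompting re-planning
    return max(feasible, key=quality_critic)
```

Note that the quality critic is never consulted on infeasible actions, mirroring the abstract's conditioning of $C_Q$ on feasibility; this is what removes the confounding between the two error types.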
Submission Type: Research Paper (4-9 Pages)
Submission Number: 44