Abstract: Large language models (LLMs) have shown promise in enhancing reinforcement learning (RL) through task decomposition, yet their generated subgoals often lack reliability, leading to inefficient exploration and suboptimal policy learning. In this paper, we propose LLMV-AgE (Verification of LLM-guided planning for Agentic Exploration), an RL framework that integrates LLM-guided subgoal planning with a hierarchical verification process to ensure both semantic validity and environmental feasibility. LLMV-AgE systematically assesses subgoal coherence, corrects invalid plans through iterative refinement, and aligns policy learning with reliable, goal-driven objectives. Empirical results on the procedurally generated Crafter benchmark demonstrate that LLMV-AgE significantly improves exploration efficiency and policy robustness by mitigating the impact of hallucinated subgoals and guiding agents toward more achievable goals.
Keywords: Reinforcement Learning; Task Planning; Large Language Models
Submission Number: 30
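To make the verify-and-refine idea in the abstract concrete, the sketch below shows one plausible reading of such a pipeline: an LLM planner proposes subgoals, each subgoal passes through a semantic-validity check and an environmental-feasibility check, and invalid subgoals are sent back to the LLM for refinement before the agent is trained on them. This is purely an illustrative assumption; the function names (`propose_subgoals`, `is_semantically_valid`, `is_feasible`, `refine_subgoal`) and the refinement cap are hypothetical and not specified by the paper.

```python
"""Hypothetical sketch of an LLM subgoal verify-and-refine loop.

All callables and constants here are illustrative assumptions; the
abstract does not specify LLMV-AgE's actual interfaces, prompts, or
verification criteria.
"""
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subgoal:
    description: str        # natural-language subgoal from the LLM planner
    verified: bool = False  # set True once both verification stages pass


MAX_REFINEMENTS = 3  # assumed cap on refinement rounds per subgoal


def verify_and_refine(
    task: str,
    propose_subgoals: Callable[[str], List[str]],       # LLM planner (assumed)
    is_semantically_valid: Callable[[str, str], bool],   # coherence with the task
    is_feasible: Callable[[str], bool],                  # achievability in the environment
    refine_subgoal: Callable[[str, str], str],           # asks the LLM to repair an invalid subgoal
) -> List[Subgoal]:
    """Return only the subgoals that pass both verification stages."""
    verified: List[Subgoal] = []
    for text in propose_subgoals(task):
        for _ in range(MAX_REFINEMENTS + 1):
            if is_semantically_valid(task, text) and is_feasible(text):
                verified.append(Subgoal(text, verified=True))
                break
            # Invalid plan: request a corrected subgoal and re-check it.
            text = refine_subgoal(task, text)
        # Subgoals that never pass verification are dropped rather than
        # handed to the agent, so policy learning only sees vetted goals.
    return verified
```

In a full system, the returned verified subgoals would presumably shape the RL objective, for example as intrinsic reward targets that are credited when the agent achieves them; that coupling is again an assumption about how a framework like this would be wired up, not a statement of the paper's exact method.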