Keywords: Planning with Language Models, Reinforcement Learning
Abstract: Despite their strong reasoning capabilities and extensive world knowledge, Large Language Models (LLMs) often generate plans that ignore or violate task constraints, limiting their reliability in real-world planning scenarios. This stems from their limited ability to systematically incorporate constraint information during generation. Existing methods typically rely on external tools or task decomposition strategies, but do not improve the model’s intrinsic awareness or understanding of constraints.
To address this, we propose Constraint-Aware Reinforcement Learning (CARL), a novel training-based approach that systematically strengthens LLMs’ intrinsic focus on constraints. CARL introduces a constraint-aware reward by comparing the model’s output distributions under constrained and unconstrained inputs, encouraging attention to constraints and penalizing their neglect. The method is compatible with various RL frameworks and requires neither external tools nor access to stronger models.
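The reward idea described above — scoring how much the constraint text actually shifts the model’s output distribution — can be sketched minimally as a divergence between the two distributions. This is an illustrative sketch only: the function name, the choice of KL divergence, and the averaging scheme are assumptions, not the paper’s exact reward definition.

```python
import torch
import torch.nn.functional as F

def constraint_aware_reward(logits_constrained: torch.Tensor,
                            logits_unconstrained: torch.Tensor) -> torch.Tensor:
    """Illustrative reward: the more the constrained-input distribution
    diverges from the unconstrained one, the more the constraints shaped
    the output (hypothetical formulation, assuming a KL-based measure)."""
    log_p = F.log_softmax(logits_constrained, dim=-1)   # log-probs with constraints
    q = F.softmax(logits_unconstrained, dim=-1)         # probs without constraints
    # KL(q || p) summed over the vocabulary, averaged over positions
    kl = F.kl_div(log_p, q, reduction="none").sum(dim=-1)
    return kl.mean()

# Toy usage with random logits (batch=1, seq_len=4, vocab=10)
r = constraint_aware_reward(torch.randn(1, 4, 10), torch.randn(1, 4, 10))
print(float(r))  # non-negative scalar
```

A policy-gradient loop could then add this term to the task reward, so rollouts that ignore the constraint portion of the prompt (divergence near zero) receive less reinforcement.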
Extensive experiments on BlocksWorld, TravelPlanner, and T-Eval demonstrate that CARL outperforms standard RFT baselines and state-of-the-art reasoning models, and exhibits a markedly increased focus on constraint-related inputs.
Our work enables scalable, end-to-end constraint-aware planning in LLMs, marking a step toward more autonomous and compliant language agents. Code and data will be released upon publication.
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 6989