Efficient Robotic Policy Learning via Latent Space Backward Planning

Published: 01 May 2025, Last Modified: 18 Jun 2025, Venue: ICML 2025 poster, License: CC BY-NC-ND 4.0
Abstract: Current robotic planning methods often rely on predicting multi-frame images with full pixel details. While this fine-grained approach can serve as a generic world model, it introduces two significant challenges for downstream policy learning: substantial computational costs that hinder real-time deployment, and accumulated inaccuracies that can mislead action extraction. Planning with coarse-grained subgoals partially alleviates the efficiency issues. However, the forward planning schemes used by such methods can still produce off-task predictions due to accumulated errors, leading to misalignment with long-term goals. This raises a critical question: Can robotic planning be both efficient and accurate enough for real-time control in long-horizon, multi-stage tasks? To address this, we propose a **B**ackward **P**lanning scheme in **L**atent space (**LBP**), which begins by grounding the task into final latent goals, followed by recursively predicting intermediate subgoals closer to the current state. The grounded final goal enables backward subgoal planning to always remain aware of task completion, facilitating on-task prediction along the entire planning horizon. The subgoal-conditioned policy incorporates a learnable token to summarize the subgoal sequences and determines how each subgoal guides action extraction. Through extensive simulation and real-robot long-horizon experiments, we show that LBP outperforms existing fine-grained and forward planning methods, achieving state-of-the-art performance. Project Page: [https://lbp-authors.github.io](https://lbp-authors.github.io).
Lay Summary: Modern robots often try to imagine future scenes as a way to plan their actions. However, current methods are often slow and prone to inaccurate predictions, which can lead robots away from their intended goals. To solve this, we introduce a new planning strategy called **Latent Space Backward Planning (LBP)**. Instead of planning forward from the present, LBP starts from the final goal and works backward, efficiently setting meaningful checkpoints along the way. This strategy helps the robot stay on track and speeds up the planning process. We test this method in both simulations and with real robots on complex tasks, and it performs better than existing methods, making it a promising step towards more efficient and reliable robot control in the real world.
Link To Code: https://lbp-authors.github.io
Primary Area: Applications->Robotics
Keywords: Planning, Embodied Agents, Goal-Conditioned Policy
Submission Number: 10097
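
The backward planning idea described in the abstract (ground a final latent goal first, then recursively predict subgoals that move closer to the current state) can be illustrated with a toy sketch. This is not the authors' implementation: `ground_final_goal`, `predict_subgoal`, `backward_plan`, the 4-dimensional latent vectors, and the interpolation rule are all hypothetical stand-ins for the learned predictors the paper trains.

```python
import numpy as np

def ground_final_goal(z_now, task_embedding):
    # Hypothetical stand-in for a learned goal predictor: maps the current
    # latent state and a task embedding to a final latent goal.
    return z_now + task_embedding

def predict_subgoal(z_now, z_target, ratio=0.5):
    # Hypothetical stand-in for a learned subgoal predictor: here, plain
    # linear interpolation between the current state and the target.
    return z_now + ratio * (z_target - z_now)

def backward_plan(z_now, task_embedding, num_subgoals=3):
    # Step 1: ground the task into a final latent goal.
    plan = [ground_final_goal(z_now, task_embedding)]
    # Step 2: recursively predict subgoals, each conditioned on the
    # previously predicted one, so every new subgoal is closer to z_now.
    for _ in range(num_subgoals):
        plan.append(predict_subgoal(z_now, plan[-1]))
    return plan  # ordered from the final goal down to the nearest subgoal

z_now = np.zeros(4)   # assumed toy latent state
task = np.ones(4)     # assumed toy task embedding
plan = backward_plan(z_now, task)
distances = [float(np.linalg.norm(z - z_now)) for z in plan]
print(distances)
```

In this toy version every subgoal is anchored on the grounded final goal, so the sequence stays tied to task completion instead of drifting as it would when each step is predicted forward from the previous one. A subgoal-conditioned policy would then consume `plan` to select actions.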