Keywords: World Models, Long-horizon Planning, Contact-rich Manipulation
TL;DR: We present HDFlow, a novel hierarchical planner that combines a diffusion model for high-level exploration with a rectified flow model for low-level trajectory generation, and demonstrate its efficiency on four contact-rich furniture assembly tasks.
Abstract: Long-horizon manipulation tasks represent a significant challenge in robotics, demanding both strategic, high-level reasoning and fast, precise, low-level control. While recent advances in generative models have shown promise in generating behavior plans for long-horizon tasks, they often lack a principled framework for hierarchical decomposition, and their iterative denoising process makes real-time execution computationally demanding. In this work, we introduce Hierarchical Diffusion-Flow ($\texttt{HDFlow}$), a novel hierarchical planning framework that leverages the complementary strengths of $\textit{diffusion}$ and $\textit{rectified flow}$ models. $\texttt{HDFlow}$ employs a high-level diffusion planner to generate sequences of strategic subgoals in a learned latent space, capitalizing on diffusion's powerful exploratory capabilities. These subgoals then guide a low-level rectified flow planner that generates smooth and dense trajectories, exploiting the speed and efficiency of ordinary differential equation (ODE)-based trajectory generation. By combining the two, this hybrid approach overcomes the limitations of single-paradigm generative planners and enables robust and efficient long-horizon planning. We evaluate $\texttt{HDFlow}$ on four challenging furniture assembly tasks, where it significantly outperforms state-of-the-art methods. Our work demonstrates that a hybrid generative planner provides a powerful solution for long-horizon robotic assembly. Project website: https://hdflow-page.github.io
Submission Number: 66
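The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the HDFlow implementation: the subgoal list is hard-coded in place of a high-level diffusion planner, planning happens directly in a 2-D state space rather than a learned latent space, and `toy_velocity`, `sample_segment`, and `plan` are hypothetical names introduced here. The toy velocity field stands in for a learned rectified flow network; it is chosen so that a few Euler steps of the ODE carry a noise sample onto a straight segment between consecutive subgoals.

```python
import numpy as np


def toy_velocity(x, t, start, subgoal):
    """Hypothetical stand-in for a learned velocity network v(x, t | start, subgoal).

    It points from the current sample x toward a straight-line segment
    between start and subgoal, scaled so the flow arrives at t = 1.
    """
    horizon = x.shape[0]
    alphas = np.linspace(0.0, 1.0, horizon)[:, None]
    target = (1.0 - alphas) * start + alphas * subgoal
    return (target - x) / (1.0 - t)


def sample_segment(start, subgoal, horizon=8, num_steps=4, rng=None):
    """Low level: Euler-integrate dx/dt = v(x, t) from noise (t = 0) to t = 1."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.standard_normal((horizon, start.shape[0]))  # initial noise sample
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        x = x + dt * toy_velocity(x, t, start, subgoal)
    return x


def plan(state, subgoals, horizon=8, num_steps=4):
    """Chain one dense segment per subgoal into a single trajectory."""
    segments = []
    for goal in subgoals:
        segments.append(sample_segment(state, goal, horizon, num_steps))
        state = goal  # assumes each segment ends at its subgoal
    return np.concatenate(segments, axis=0)


# High level: fixed subgoals standing in for a diffusion planner's output.
subgoals = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
trajectory = plan(np.zeros(2), subgoals)
```

With this particular field even a single Euler step lands on the target segment, which loosely mirrors why straight ODE paths permit few-step sampling. A learned velocity field would only approximate this behavior, so the step count would become a trade-off between speed and accuracy.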