Efficient long-horizon planning and learning for locomotion and object manipulation

Published: 24 Oct 2024, Last Modified: 06 Nov 2024
LEAP 2024 Poster, CC BY 4.0
Keywords: contact-rich manipulation and locomotion, Monte-Carlo tree search, diffusion models, multi-modal imitation learning
TL;DR: Our approach uses Monte-Carlo tree search (MCTS) to efficiently search over discrete contact sequences and structure-exploiting gradient-based trajectory optimization to check the feasibility of candidate contact plans.
Abstract: Locomotion and manipulation are difficult problems in robotics because they are long-horizon decision-making problems with a combination of discrete and continuous decision variables. While simple end-to-end imitation and reinforcement learning have shown promise in the past few years, they generally struggle with problems that require reasoning over a long horizon, e.g., stacking objects or locomotion in highly constrained environments. In this paper, we propose a structured approach to learning long-horizon locomotion and manipulation problems. Our approach uses Monte-Carlo tree search (MCTS) to efficiently search over discrete decision variables (e.g., which surface to contact next) and structure-exploiting gradient-based trajectory optimization to check the feasibility of candidate contact plans. Since the whole process is still time-consuming and cannot be run for real-time control, we propose to leverage imitation learning (in particular, diffusion models) to learn a policy that can reactively generate new feasible contact sequences. We tested the whole pipeline on quadrupedal locomotion on stepping stones and fine object manipulation and show that this framework can reach real-time rates.
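To make the high-level loop in the abstract concrete, the sketch below shows how MCTS over discrete contact choices can be paired with a feasibility oracle standing in for the gradient-based trajectory optimizer. This is a minimal, hypothetical illustration only: the class and function names (Node, is_feasible, mcts, the surface labels) are placeholders and do not reflect the authors' implementation, and the feasibility check is a random stub where the paper would call a structure-exploiting trajectory optimizer.

```python
# Hypothetical sketch: MCTS proposes discrete contact sequences (which surface
# to touch next); a feasibility check stands in for trajectory optimization.

import math
import random


class Node:
    """One node per partial contact sequence."""

    def __init__(self, contacts, parent=None):
        self.contacts = contacts          # contact surfaces chosen so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0


def ucb_score(node, c=1.4):
    """Upper-confidence bound used to balance exploration and exploitation."""
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )


def is_feasible(contacts):
    """Placeholder for the gradient-based trajectory optimization that would
    verify dynamic feasibility of a candidate contact plan (random stub here)."""
    return random.random() > 0.3


def mcts(surfaces, horizon, iterations=200):
    root = Node(contacts=[])
    for _ in range(iterations):
        # Selection: descend via UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=ucb_score)
        # Expansion: add one child per candidate contact surface.
        if len(node.contacts) < horizon:
            node.children = [Node(node.contacts + [s], parent=node) for s in surfaces]
            node = random.choice(node.children)
        # Rollout: complete the sequence randomly, score it by feasibility.
        rollout = node.contacts + [
            random.choice(surfaces) for _ in range(horizon - len(node.contacts))
        ]
        reward = 1.0 if is_feasible(rollout) else 0.0
        # Backpropagation: propagate the reward up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the contact prefix of the most-visited child of the root.
    return max(root.children, key=lambda n: n.visits).contacts


if __name__ == "__main__":
    print(mcts(surfaces=["stone_1", "stone_2", "stone_3"], horizon=4))
```

In the full pipeline described above, the plans found offline by such a search-plus-optimization loop would then serve as demonstrations for a diffusion-model policy that regenerates feasible contact sequences at real-time rates.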
Submission Number: 28