LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Shu Wang; Muzhi Han; Ziyuan Jiao; Zeyu Zhang; Ying Nian Wu; Song-Chun Zhu; Hangxin Liu

Published: 18 Jun 2024, Last Modified: 05 Sept 2024

Venue: MFM-EAI@ICML2024 Poster

License: CC BY 4.0

Keywords: LLM, Motion planning

Abstract: Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM3, a novel multi-modal foundation model TAMP framework featuring a domain-independent interface. Specifically, we leverage the powerful reasoning and planning capabilities of foundation models to propose symbolic action sequences and select continuous action parameters for motion planning. Through a series of simulations in a box-packing domain, we quantitatively demonstrate the effectiveness of our method. Ablation studies underscore the significant contribution of motion failure reasoning to the success of LLM3. Furthermore, we conduct qualitative experiments on a physical manipulator, demonstrating the practical applicability of our approach in real-world settings. Code is available at https://github.com/AssassinWS/LLM-TAMP.

Submission Number: 27
