Learning Compositional Behaviors from Demonstration and Language

Weiyu Liu; Neil Nie; Jiayuan Mao; Ruohan Zhang; Jiajun Wu

Published: 24 Oct 2024, Last Modified: 06 Nov 2024

Venue: LEAP 2024 Oral

License: CC BY 4.0

Keywords: Manipulation, Planning Abstractions, Learning from Language

TL;DR: We introduce a new framework for long-horizon robotic manipulation that integrates imitation learning and planning with models automatically constructed from language.

Abstract:

We introduce Behavior from Language and Demonstration (BLADE), a framework for long-horizon robotic manipulation that integrates imitation learning and model-based planning. BLADE leverages language-annotated demonstrations, extracts abstract action knowledge from large language models (LLMs), and constructs a library of structured, high-level action representations. These representations include preconditions and effects grounded in visual perception for each high-level action, along with corresponding controllers implemented as neural network-based policies. BLADE recovers such structured representations automatically, without manually labeled states or symbolic definitions. BLADE generalizes to novel situations, including novel initial states, external state perturbations, and novel goals. We validate the effectiveness of our approach both in simulation and on real robots, using a diverse set of objects in settings that involve articulated parts, partial observability, and geometric constraints.

Submission Number: 22
