PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

Garrett Thomas; Ching-An Cheng; Ricky Loynd; Felipe Vieira Frujeri; Vibhav Vineet; Mihai Jalobeanu; Andrey Kolobov

Published: 30 Aug 2023, Last Modified: 25 Oct 2023, CoRL 2023 Poster, Readers: Everyone

Keywords: Robot learning, Robotic manipulation, Visuomotor representations

TL;DR: A model architecture for robotic manipulation tailored to the realities of robotic manipulation datasets

Abstract: A rich representation is key to general robotic manipulation, but existing approaches to representation learning require large amounts of multimodal demonstrations. In this work we propose PLEX, a transformer-based architecture that learns from a small amount of task-agnostic visuomotor trajectories and a much larger amount of task-conditioned object manipulation videos — a type of data available in quantity. PLEX uses visuomotor trajectories to induce a latent feature space and to learn task-agnostic manipulation routines, while diverse video-only demonstrations teach PLEX how to plan in the induced latent feature space for a wide variety of tasks. Experiments showcase PLEX’s generalization on Meta-World and SOTA performance in challenging Robosuite environments. In particular, using relative positional encoding in PLEX’s transformers greatly helps in low-data regimes of learning from human-collected demonstrations.

Student First Author: yes

Supplementary Material: zip


Website: https://microsoft.github.io/PLEX

Publication Agreement: pdf

Poster Spotlight Video: mp4
