Self-Supervised 3D Representation Learning for Robotics

Published: 07 May 2023, Last Modified: 16 May 2023
Venue: ICRA-23 Workshop on Pretraining4Robotics Lightning
Readers: Everyone
Keywords: 3D representation learning, robot learning, imitation learning, masked reconstruction
TL;DR: A self-supervised 3D representation learning framework for robotic manipulation tasks
Abstract: Recent work on visual representation learning from images and videos has been shown to be efficient for robotic manipulation tasks. However, learning to act in a 6-DoF 3D action space from 2D observations is a hard problem. As a result, 2D representation learning methods require huge amounts of data for pretraining. To address this, we investigate a self-supervised 3D representation learning framework that works with limited data. Our model learns 3D scene representations through self-supervised masked reconstruction of 3D voxel grids, trained alongside imitation learning on few-shot task demonstrations. We use Perceiver-Actor as the backbone for 3D representation learning. Our preliminary experiments show improved task success rates on the training task and its visual variations compared to the base Perceiver-Actor.
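
The masked-reconstruction objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Perceiver-Actor backbone is not modeled, and the function names, the cubic patch size, the mask ratio, the mean-squared-error reconstruction loss, and the weighted sum with the imitation loss are all assumptions made for illustration.

```python
import numpy as np


def mask_voxel_patches(grid, patch=4, mask_ratio=0.5, rng=None):
    """Zero out a random subset of cubic patches in a (D, H, W) voxel grid.

    Returns the masked grid and a boolean mask at voxel resolution.
    Patch size and mask ratio are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    d, h, w = grid.shape
    if d % patch or h % patch or w % patch:
        raise ValueError("grid dimensions must be divisible by the patch size")
    pd, ph, pw = d // patch, h // patch, w // patch
    n_patches = pd * ph * pw
    n_masked = int(round(mask_ratio * n_patches))

    # Choose which patches to hide, then expand the choice to voxel resolution.
    patch_mask = np.zeros(n_patches, dtype=bool)
    patch_mask[rng.choice(n_patches, size=n_masked, replace=False)] = True
    patch_mask = patch_mask.reshape(pd, ph, pw)
    voxel_mask = patch_mask
    for axis in range(3):
        voxel_mask = np.repeat(voxel_mask, patch, axis=axis)

    masked = np.where(voxel_mask, 0.0, grid)
    return masked, voxel_mask


def masked_reconstruction_loss(pred, target, voxel_mask):
    """Mean squared error computed over the masked voxels only (assumed loss)."""
    diff = (pred - target)[voxel_mask]
    return float(np.mean(diff ** 2)) if diff.size else 0.0


def joint_loss(imitation_loss, recon_loss, recon_weight=1.0):
    """Hypothetical joint objective: imitation loss plus weighted reconstruction loss."""
    return imitation_loss + recon_weight * recon_loss


if __name__ == "__main__":
    grid = np.random.default_rng(0).random((8, 8, 8))
    masked, mask = mask_voxel_patches(grid, rng=np.random.default_rng(1))
    # A real model would predict the hidden voxels; here the masked grid
    # itself stands in as a trivial "prediction".
    recon = masked_reconstruction_loss(masked, grid, mask)
    print(joint_loss(imitation_loss=0.5, recon_loss=recon, recon_weight=0.1))
```

In this sketch the reconstruction term is evaluated only on hidden voxels, so the model gets no credit for copying visible input; whether the paper restricts the loss this way is not stated in the abstract.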