LHManip: A Dataset for Long-Horizon Language-Grounded Manipulation Tasks in Cluttered Tabletop Environments

Federico Ceola; Lorenzo Natale; Niko Suenderhauf; Krishan Rana

LHManip: A Dataset for Long-Horizon Language-Grounded Manipulation Tasks in Cluttered Tabletop Environments

Federico Ceola, Lorenzo Natale, Niko Suenderhauf, Krishan Rana

Published: 26 Jun 2024, Last Modified: 09 Jul 2024DGR@RSS2024 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Long-Horizon Robotic Manipulation, Real Robot Dataset, Language Guided Tasks

TL;DR: We present LHManip: a dataset for long-horizon language-grounded manipulation tasks in cluttered tabletop environments collected via teleoperation.

Abstract: Instructing a robot to complete an everyday task within our homes has been a long-standing challenge for robotics. While recent progress in language-conditioned imitation learning and offline reinforcement learning has demonstrated impressive performance across a wide range of tasks, they are typically limited to short-horizon tasks -- not reflective of those a home robot would be expected to complete. While existing architectures have the potential to learn these desired behaviours, the lack of the necessary long-horizon, multi-step datasets for real robotic systems poses a significant challenge. To this end, we present the Long-Horizon Manipulation (LHManip) dataset comprising 200 episodes, demonstrating 20 different manipulation tasks via real robot teleoperation. The tasks entail multiple sub-tasks, including grasping, pushing, stacking and throwing objects in highly cluttered environments. Each task is paired with a natural language instruction and multi-camera viewpoints for point-cloud or NeRF reconstruction. In total, the dataset comprises 176,278 observation-action pairs which form part of the Open X-Embodiment dataset. The full LHManip dataset is made publicly available here.

Submission Number: 3

Loading