Keywords: Imitation Learning, Human Demonstrations, 3D Generative Foundational Model, Procedural Simulation
TL;DR: Parsing a single human demonstration to generate simulatable assets for creating an imitation learning dataset.
Abstract: Imitation learning is a common paradigm for teaching robots new tasks. However, collecting robot demonstrations through teleoperation or kinesthetic teaching is tedious and time-consuming, slowing down training data collection for policy learning. In contrast, directly demonstrating a task with our own human embodiment is much easier and such data is available in abundance, although transferring it to the robot can be non-trivial. In this work, we propose Real2Gen to train a manipulation policy from a single human demonstration. Real2Gen extracts the required information from the demonstration and transfers it to a simulation environment, where a programmable expert agent can demonstrate the task arbitrarily many times, generating an unlimited amount of data to train a flow matching policy. We evaluate Real2Gen on human demonstrations from three different real-world tasks and compare it to a recent baseline. Real2Gen achieves an average 26.6% increase in success rate and better generalization of the trained policy, owing to the abundance and diversity of its training data. We make the data, code, and trained models publicly available at real2gen.cs.uni-freiburg.de.
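The abstract mentions training a flow matching policy on the simulated expert demonstrations. Below is a minimal, hedged sketch of what a conditional flow matching training objective for such a policy could look like; it is not the authors' implementation, and the network architecture, dimensions, and data loader names are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the authors' code) of a conditional flow
# matching training step for an action policy, in PyTorch.
import torch
import torch.nn as nn

class FlowMatchingPolicy(nn.Module):
    """Predicts a velocity field conditioned on observation, noisy action, and time."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, noisy_action, t):
        return self.net(torch.cat([obs, noisy_action, t], dim=-1))

def flow_matching_loss(policy, obs, expert_action):
    """Conditional flow matching: regress the straight-line velocity
    from a noise sample toward the expert action."""
    noise = torch.randn_like(expert_action)
    t = torch.rand(expert_action.shape[0], 1)
    x_t = (1 - t) * noise + t * expert_action   # point on the interpolation path
    target_v = expert_action - noise            # constant velocity along that path
    pred_v = policy(obs, x_t, t)
    return nn.functional.mse_loss(pred_v, target_v)

# Hypothetical usage with observation-action pairs from simulated expert rollouts:
# for obs, act in sim_demo_loader:
#     loss = flow_matching_loss(policy, obs, act)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```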
Supplementary Material: pdf
Submission Number: 16