Keywords: Video World Models, Synthetic Data, Behavior Generalization, Environment Generalization
TL;DR: We propose neural trajectories, a pipeline for augmenting robot training data with synthetic rollouts from video world models, enabling robots to perform entirely new behaviors in unseen environments.
Abstract: In this work, we unlock new capabilities in robot learning from neural trajectories: synthetic robot data generated by video world models. Our proposed recipe is simple but powerful: we take recent state-of-the-art video generative models (world models), adapt them to the target robot embodiment, and generate new synthetic robot data for the same task or even for new behaviors. Since these video world models generate only videos, we explore two techniques for obtaining robot actions: extracting latent actions from a general-purpose latent action model, and predicting actions with an inverse-dynamics model (IDM), giving flexibility across diverse scenarios. Our approach unlocks behavior and environment generalization, allowing a humanoid robot to perform 20+ new behaviors in unseen environments while collecting teleoperation data only for pick-and-place in a single environment. By introducing a new world-modeling benchmark, we demonstrate that stronger video world models directly correlate with improved downstream robot policy performance. This establishes a new scaling dimension beyond simply collecting additional teleoperation data, changing how we approach robot learning.
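To make the recipe concrete, below is a minimal Python sketch of the pipeline as the abstract describes it: roll out an embodiment-adapted video world model, then label the generated frames with actions from an inverse-dynamics model. All names here (`world_model`, `idm`, their methods, and parameters) are hypothetical placeholders for illustration, not the paper's actual API.

```python
# Hedged sketch of the neural-trajectories pipeline from the abstract.
# `world_model.generate` and `idm.predict` are assumed interfaces, not
# the authors' implementation.

def generate_neural_trajectories(world_model, idm, prompt_frames,
                                 task_prompt, n_rollouts=100):
    """Generate synthetic robot trajectories from a video world model,
    then label them with actions from an inverse-dynamics model (IDM)."""
    dataset = []
    for _ in range(n_rollouts):
        # 1. Roll out the embodiment-adapted video world model to get a
        #    synthetic video of the robot performing the prompted behavior.
        video = world_model.generate(prompt_frames, text=task_prompt)

        # 2. Recover an action for each consecutive frame pair. The abstract
        #    names two options: latent actions from a general-purpose latent
        #    action model, or predicted actions from an IDM (used here).
        actions = [idm.predict(video[t], video[t + 1])
                   for t in range(len(video) - 1)]

        dataset.append({"observations": video, "actions": actions})
    # Synthetic (observation, action) pairs to mix into policy training.
    return dataset
```

The design choice reflected here is that the world model and the action labeler are decoupled: the same generated videos can be labeled by either action-recovery technique, which is what gives the pipeline flexibility across scenarios.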
Supplementary Material: zip
Submission Number: 836