Trajectory World Models for Heterogeneous Environments

Published: 01 May 2025, Last Modified: 18 Jun 2025, Venue: ICML 2025 poster, License: CC BY 4.0
TL;DR: We introduce UniTraj, a unified dataset, and TrajWorld, a flexible architecture, which together enable positive transfer when pre-training world models across heterogeneous environments.
Abstract: Heterogeneity in sensors and actuators across environments poses a significant challenge to building large-scale pre-trained world models on top of low-dimensional sensor information. In this work, we explore pre-training world models for heterogeneous environments by addressing key transfer barriers in both data diversity and model flexibility. We introduce UniTraj, a unified dataset comprising over one million trajectories from 80 environments, designed to scale data while preserving critical diversity. Additionally, we propose TrajWorld, a novel architecture capable of flexibly handling varying sensor and actuator information and capturing environment dynamics in-context. Pre-training TrajWorld on UniTraj yields substantial gains in transition prediction, sets a new state of the art in off-policy evaluation, and delivers superior online performance in model predictive control. To the best of our knowledge, this work is the first to demonstrate the transfer benefits of world models across heterogeneous and complex control environments. Code and data are available at https://github.com/thuml/TrajWorld.
Lay Summary: Robots and control systems often use machine learning models called world models to predict how their actions will affect the environment. However, these models are usually trained in narrow settings with fixed sensor and actuator configurations. In real-world applications, different systems can vary widely in how many sensors or actuators they have, what each one represents, and how they are physically arranged or controlled. This structural heterogeneity makes it hard to train general-purpose models. To address this, we introduce UniTraj, a large dataset with over one million trajectories from 80 diverse control environments. It is designed to scale pre-training while preserving the diversity seen in real systems. We also propose TrajWorld, a model architecture that flexibly adapts to different sensor and actuator structures. TrajWorld learns to infer the underlying dynamics within the context of observed trajectories, allowing it to understand the environment by considering recent history and interactions. This work helps build general world models capable of transferring knowledge across diverse robots and environments, paving the way for multimodal world models that integrate vision and sensor data for a deeper understanding of the physical world.
Link To Code: https://github.com/thuml/TrajWorld
Primary Area: Reinforcement Learning->Deep RL
Keywords: world models, pre-training, heterogeneous environments
Submission Number: 1215
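
The structural heterogeneity described in the abstract and lay summary can be made concrete with a small sketch. The Python below is an illustration only and is not taken from the TrajWorld code base: the function names (`to_variates`, `pad_batch`), the environment dimensions, and the padding-plus-mask scheme are all assumptions chosen to show why trajectories with different numbers of sensors and actuators do not fit a single fixed-width input without extra handling.

```python
import numpy as np


def to_variates(states, actions):
    """Stack per-step states and actions into one (T, n_variates) array.

    Returns the stacked array and a boolean role mask that marks which
    columns are actuator (action) channels. Hypothetical helper, not
    part of any published API.
    """
    states = np.asarray(states, dtype=np.float32)
    actions = np.asarray(actions, dtype=np.float32)
    if states.ndim != 2 or actions.ndim != 2:
        raise ValueError("states and actions must be 2-D: (T, dim)")
    if states.shape[0] != actions.shape[0]:
        raise ValueError("states and actions must have the same length T")
    variates = np.concatenate([states, actions], axis=1)
    is_action = np.concatenate([
        np.zeros(states.shape[1], dtype=bool),
        np.ones(actions.shape[1], dtype=bool),
    ])
    return variates, is_action


def pad_batch(variate_list):
    """Zero-pad trajectories of equal length T to a common variate width.

    Returns a (B, T, max_width) batch and a (B, max_width) validity mask
    so that a downstream model could ignore the padded channels.
    """
    T = variate_list[0].shape[0]
    if any(v.shape[0] != T for v in variate_list):
        raise ValueError("all trajectories must share the same length T")
    width = max(v.shape[1] for v in variate_list)
    batch = np.zeros((len(variate_list), T, width), dtype=np.float32)
    valid = np.zeros((len(variate_list), width), dtype=bool)
    for i, v in enumerate(variate_list):
        batch[i, :, : v.shape[1]] = v
        valid[i, : v.shape[1]] = True
    return batch, valid


# Two hypothetical environments with different sensor/actuator counts.
rng = np.random.default_rng(0)
T = 50
tokens_a, is_action_a = to_variates(rng.normal(size=(T, 3)), rng.normal(size=(T, 1)))
tokens_b, is_action_b = to_variates(rng.normal(size=(T, 17)), rng.normal(size=(T, 6)))

batch, valid = pad_batch([tokens_a, tokens_b])
print(batch.shape, valid.sum(axis=1))
```

Treating each scalar sensor or actuator channel as its own variate, with masks for role and validity, is one simple way to give differently shaped environments a shared layout; how TrajWorld itself handles this is described in the paper and the linked repository.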