WestWorld: A Knowledge-Encoded Scalable Trajectory World Model for Diverse Robotics

Published: 02 Mar 2026, Last Modified: 05 Mar 2026
Venue: ICLR 2026 Workshop World Models
License: CC BY 4.0
Keywords: Trajectory World Model, Knowledge-Encoded Robotics Learning, Mixture-of-Experts
Abstract: Trajectory world models have emerged as a cornerstone of robotic dynamics learning, enabling more effective planning and control in complex environments. Recent studies have explored pretraining such models across diverse robotic systems, but two major challenges remain: 1) scaling to a large number of heterogeneous robotic systems, and 2) incorporating domain knowledge of robot morphology, the absence of which limits zero-shot generalization to previously unseen systems. To address these challenges, we introduce *WestWorld*, a kno**W**ledge-**E**ncoded **S**calable **T**rajectory **World** model for diverse robotics. For scalability, *WestWorld* uses a system-aware Mixture-of-Experts (Sys-MoE) that routes inputs to specialized experts via a learnable system embedding. To enhance zero-shot generalization, it incorporates domain knowledge of robot physical structure through a structural embedding that aligns trajectory representations with morphological information. After pretraining on 89 environments spanning diverse morphologies in both simulation and real-world settings, *WestWorld* significantly outperforms state-of-the-art baselines in zero-shot trajectory prediction. Notably, it demonstrates strong scalability as the number of robotic environments increases.
Submission Number: 24
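
Illustrative sketch (not from the paper): the abstract describes Sys-MoE only as routing inputs to specialized experts via a learnable system embedding. The NumPy snippet below shows one way such routing could look. Everything in it is an assumption made for illustration: the function name `sys_moe_forward`, the top-k gating, the linear experts, the choice to compute gate scores from the system embedding alone, and all dimensions. It should not be read as the authors' implementation.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())  # subtract max for numerical stability
    return z / z.sum()

def sys_moe_forward(x, system_id, sys_emb, w_gate, experts, top_k=2):
    """Hypothetical system-aware MoE layer.

    x:       feature vector of shape [d]
    sys_emb: table of system embeddings, shape [n_systems, d_sys]
    w_gate:  gating matrix, shape [d_sys, n_experts]
    experts: stacked linear experts, shape [n_experts, d, d]
    """
    # Assumption: the gate sees only the system embedding, not x.
    gate_logits = sys_emb[system_id] @ w_gate       # [n_experts]
    top = np.argsort(gate_logits)[-top_k:]          # indices of the top_k experts
    weights = softmax(gate_logits[top])             # renormalize over selected experts
    y = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return y, top

# Toy dimensions, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
n_systems, d_sys, d, n_experts = 4, 8, 16, 6
sys_emb = rng.normal(size=(n_systems, d_sys))
w_gate = rng.normal(size=(d_sys, n_experts))
experts = rng.normal(size=(n_experts, d, d))

x = rng.normal(size=d)
y, chosen = sys_moe_forward(x, 1, sys_emb, w_gate, experts)
```

Under these assumptions, every input from the same robotic system is sent to the same subset of experts, so adding a new system adds one embedding row rather than a new model. Whether WestWorld's gate also conditions on the trajectory tokens is not stated in the abstract.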