FLARE: Robot Learning with Implicit World Modeling

Ruijie Zheng; Jing Wang; Scott Reed; Johan Bjorck; Yu Fang; Fengyuan Hu; Joel Jang; Kaushil Kundalia; Zongyu Lin; Loïc Magne; Avnish Narayan; You Liang Tan; Guanzhi Wang; Qi Wang; Jiannan Xiang; Yinzhen Xu; Seonghyeon Ye; Jan Kautz; Furong Huang; Yuke Zhu; Linxi Fan

Published: 08 Aug 2025, Last Modified: 16 Sept 2025. Venue: CoRL 2025 Poster. Readers: Everyone. License: CC BY 4.0

Keywords: World Model, VLA, Humanoid Robotics

TL;DR: We propose FLARE, a conceptually simple and lightweight framework for joint robot policy learning and latent world modeling.

Abstract: We introduce **F**uture **LA**tent **RE**presentation Alignment (**FLARE**), a novel framework that integrates predictive world modeling into robot policy learning. By aligning features from a diffusion transformer with latent embeddings of future observations, **FLARE** enables a diffusion transformer policy to anticipate latent representations of future observations, allowing it to reason about long-term consequences while generating actions. Remarkably lightweight, **FLARE** requires only minimal architectural modifications, namely adding a few tokens to standard vision-language-action (VLA) models, yet delivers substantial performance gains. Across two challenging multitask simulation imitation learning benchmarks spanning single-arm and humanoid tabletop manipulation, **FLARE** achieves state-of-the-art performance, outperforming prior policy learning baselines by up to 26%. Moreover, **FLARE** unlocks the ability to co-train with human egocentric video demonstrations lacking action labels, significantly boosting policy generalization to a novel object with unseen geometry with as few as one robot demonstration. Our results establish **FLARE** as a general and scalable approach for combining implicit world modeling with high-frequency robotic control.
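The alignment idea in the abstract can be illustrated with a small sketch. The abstract does not specify the loss, so everything here is an assumption made for illustration: the use of cosine similarity as the alignment measure, the additive combination with an action loss, and the hypothetical names `alignment_loss`, `total_loss`, and `align_weight`. Plain Python lists stand in for the diffusion transformer's added-token features and for the embeddings of future observations.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small epsilon guards against division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-8)


def alignment_loss(predicted_tokens, future_embeddings):
    """Mean (1 - cosine similarity) over paired vectors.

    predicted_tokens: features read out at the extra tokens (assumed).
    future_embeddings: latent embeddings of future observations (assumed
    to come from a separate encoder; not specified in the abstract).
    """
    if len(predicted_tokens) != len(future_embeddings):
        raise ValueError("token and embedding counts must match")
    losses = [
        1.0 - cosine_similarity(p, f)
        for p, f in zip(predicted_tokens, future_embeddings)
    ]
    return sum(losses) / len(losses)


def total_loss(action_loss, predicted_tokens, future_embeddings, align_weight=0.5):
    """Action loss plus a weighted alignment term.

    align_weight is an illustrative value, not one taken from the paper.
    """
    return action_loss + align_weight * alignment_loss(
        predicted_tokens, future_embeddings
    )
```

Under these assumptions, perfectly aligned features contribute essentially nothing to the total, while orthogonal features add the full `align_weight`. Because the alignment term needs no action labels, a formulation of this kind is also compatible with the action-free video co-training described in the abstract, where only the alignment term would apply.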

Supplementary Material: zip

Submission Number: 951
