Recurrent world model with tokenized latent states

Published: 06 Mar 2025 · Last Modified: 15 Apr 2025 · ICLR 2025 Workshop World Models · CC BY 4.0
Keywords: world model, attention, memory, imitation learning
TL;DR: We introduce TokenWM, a recurrent state-space model with tokenized latent states and memory-augmented attention to enhance world modeling in complex environments.
Abstract: World models have attracted increasing attention in recent years. We introduce a new architecture, TokenWM, which maintains the recurrent nature of state-space models while incorporating tokenized latent states and a memory-augmented attention mechanism to improve modeling capacity in complex environments. Preliminary results on the LIBERO benchmark show that the new architecture handles complex tasks better than the popular RSSM architecture. We believe TokenWM introduces a new design paradigm for recurrent world models, enabling more expressive and scalable decision-making in complex environments.
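To make the architectural idea in the abstract concrete, the sketch below illustrates what one recurrent step of a TokenWM-style model could look like: the latent state is a small set of tokens rather than a single vector, and each step those tokens attend over an external memory and the current observation before the memory is updated. This is a minimal illustration under assumptions, not the authors' implementation; all module names, shapes, and the memory update rule (here a simple FIFO buffer) are guesses.

```python
# Hypothetical sketch of a TokenWM-style recurrent step (not the paper's code).
import torch
import torch.nn as nn


class TokenWMCellSketch(nn.Module):
    def __init__(self, num_tokens=8, d_model=256, mem_slots=64, n_heads=4):
        super().__init__()
        # Learned initial latent tokens (the "tokenized latent state").
        self.init_tokens = nn.Parameter(torch.randn(num_tokens, d_model))
        self.mem_slots = mem_slots
        # Memory-augmented attention: latent tokens query [memory ; observation tokens].
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Self-attention mixes information among the latent tokens themselves.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def initial_state(self, batch_size, device):
        """Initial latent tokens and an empty (zero) memory buffer."""
        state = self.init_tokens.unsqueeze(0).expand(batch_size, -1, -1)
        memory = torch.zeros(batch_size, self.mem_slots, self.init_tokens.shape[-1], device=device)
        return state, memory

    def forward(self, state, memory, obs_tokens, action_emb):
        """One recurrent update: (state, memory, obs, action) -> (state', memory')."""
        # Condition the latent tokens on the action (broadcast add; an assumption).
        x = state + action_emb.unsqueeze(1)
        # Attend over past memory slots and current observation tokens.
        context = torch.cat([memory, obs_tokens], dim=1)
        x = self.norm1(x + self.cross_attn(x, context, context, need_weights=False)[0])
        x = self.norm2(x + self.self_attn(x, x, x, need_weights=False)[0])
        new_state = self.norm3(x + self.ffn(x))
        # FIFO memory update: append the new tokens, drop the oldest slots.
        new_memory = torch.cat([memory, new_state], dim=1)[:, -self.mem_slots:]
        return new_state, new_memory


# Usage sketch: roll the cell forward over a short trajectory of encoded observations.
cell = TokenWMCellSketch()
state, memory = cell.initial_state(batch_size=2, device="cpu")
for _ in range(5):
    obs_tokens = torch.randn(2, 16, 256)   # e.g. patch tokens from an image encoder
    action_emb = torch.randn(2, 256)       # embedded action
    state, memory = cell(state, memory, obs_tokens, action_emb)
```

The intended contrast with RSSM, as the abstract frames it, is that the recurrent state is a set of tokens updated by attention rather than a single deterministic/stochastic vector updated by a GRU, which is what is meant by keeping recurrence while increasing modeling capacity.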
Submission Number: 41