Keywords: world model, attention, memory, imitation learning
TL;DR: We introduce TokenWM, a recurrent state-space model with tokenized latent states and memory-augmented attention to enhance world modeling in complex environments.
Abstract: World models have become increasingly popular in recent years. We introduce TokenWM, a new architecture that maintains the recurrent nature of state-space models while incorporating tokenized latent states and a memory-augmented attention mechanism to improve modeling capacity in complex environments. Preliminary results on the LIBERO benchmarks demonstrate that the new architecture is better suited to complex tasks than the popular RSSM architecture. We believe TokenWM introduces a new design paradigm for recurrent world models, enabling more expressive and scalable decision-making in complex environments.
Submission Number: 41