Keywords: Model-based Reinforcement Learning, State-Space Models, Sequence Modeling
TL;DR: We make model-based RL faster by using state-space models for sequence modeling in the world model. We showcase the results on a real robot.
Abstract: Model-based RL (MBRL) simultaneously learns a policy and a world model that captures the environment’s dynamics and rewards. The world model can be used for planning, for data collection, or to provide first-order policy gradients for training. Leveraging a world model significantly improves sample efficiency compared to model-free RL. However, training a world model alongside the policy increases the computational cost, leading to longer training times that are often intractable for complex real-world scenarios. In this work, we propose a new method for accelerating model-based RL using state-space world models. Our approach leverages state-space models (SSMs) to parallelize the training of the dynamics model, which is typically the main computational bottleneck. Additionally, we propose an architecture that provides privileged information to the world model during training, which is particularly relevant for partially observable environments. We evaluate our method on several real-world agile quadrotor flight tasks involving complex dynamics, in both fully and partially observable environments. We demonstrate a significant speedup, reducing world-model training time by up to 10 times and overall MBRL training time by up to 4 times. This benefit comes without compromising performance, as our method achieves sample efficiency and task rewards on par with state-of-the-art MBRL methods.
Submission Number: 25
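The abstract's key claim is that an SSM-based dynamics model can be trained in parallel over the time axis, unlike a step-by-step recurrent world model. As a minimal, illustrative sketch of that property (not the paper's actual architecture or code; all names and shapes here are assumptions), the recurrence of a diagonal linear SSM, h_t = a ⊙ h_{t-1} + B x_t, can be evaluated for all timesteps at once with an associative scan:

```python
# Sketch only: parallel evaluation of a diagonal linear SSM recurrence
# h_t = a_t * h_{t-1} + bx_t using an associative scan over time,
# giving O(log T) parallel depth instead of a sequential loop.
import jax
import jax.numpy as jnp


def ssm_parallel_scan(a, bx):
    """a: (T, H) per-step diagonal transitions; bx: (T, H) input terms B x_t.
    Returns the hidden states h_1..h_T (assuming h_0 = 0)."""
    def combine(left, right):
        a_l, b_l = left
        a_r, b_r = right
        # Composition of the affine maps h -> a_l*h + b_l and h -> a_r*h + b_r.
        return a_l * a_r, a_r * b_l + b_r

    _, h = jax.lax.associative_scan(combine, (a, bx))
    return h


# Tiny usage example: compare against the sequential recurrence.
key = jax.random.PRNGKey(0)
T, H = 16, 8
a = jnp.full((T, H), 0.9)            # stable diagonal dynamics (assumed values)
bx = jax.random.normal(key, (T, H))  # projected inputs, e.g. actions/observations
h_par = ssm_parallel_scan(a, bx)

h, h_seq = jnp.zeros(H), []
for t in range(T):
    h = a[t] * h + bx[t]
    h_seq.append(h)
assert jnp.allclose(h_par, jnp.stack(h_seq), atol=1e-5)
```

Because every timestep of the sequence is processed in one scan rather than a Python-level loop, a world model built from such layers can consume whole trajectories per training step, which is the source of the speedup the abstract reports.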