MOTO: Offline to Online Fine-tuning for Model-Based Reinforcement Learning

Rafael Rafailov; Kyle Beltran Hatch; Victor Kolev; John D Martin; Mariano Phielipp; Chelsea Finn

Published: 03 Mar 2023, Last Modified: 12 Apr 2023 | RRL 2023 Poster | Readers: Everyone

Keywords: model-based reinforcement learning, offline reinforcement learning, fine-tuning

TL;DR: We present a model-based RL algorithm designed specifically for offline pre-training and online fine-tuning.

Abstract: We study the problem of offline-to-online reinforcement learning from high-dimensional pixel observations. While recent model-free approaches successfully use offline pre-training with online fine-tuning to either improve the performance of the data-collection policy or adapt to novel tasks, model-based approaches remain underutilized in this setting. In this work, we argue that existing methods for high-dimensional model-based offline RL are not suitable for offline-to-online fine-tuning due to issues with representation learning shifts, off-dynamics data, and non-stationary rewards. We propose a simple on-policy model-based method with adaptive behavior regularization. In our simulation experiments, we find that our approach successfully solves long-horizon robot manipulation tasks entirely from images by using a combination of offline data and online interactions.

Track: Technical Paper

Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
