Masked Trajectory Models for Prediction, Representation, and Control

Published: 07 May 2023, Last Modified: 22 Oct 2023
Venue: ICRA-23 Workshop on Pretraining4Robotics Lightning
Readers: Everyone
Keywords: offline RL, learning for control, sequence modeling
TL;DR: We present masked trajectory models (MTM) as a general self-supervised learning paradigm for RL. A single model trained with MTM can take on different roles/capabilities at inference time by simply "prompting" it with different masks.
Abstract: We introduce Masked Trajectory Models (MTM) as a generic abstraction for sequential decision making. MTM takes a trajectory and aims to reconstruct it conditioned on random subsets of the same trajectory. By training with a highly randomized masking pattern, MTM learns versatile networks that can take on different roles or capabilities simply by choosing appropriate masks at inference time. For example, the same MTM network can be used as a forward dynamics model, an inverse dynamics model, or even an offline RL agent. Through extensive experiments on several continuous control tasks, we show that the same MTM network (i.e., the same weights) can match or outperform specialized networks trained for the aforementioned capabilities. Additionally, we find that state representations learned by MTM can significantly accelerate the learning of traditional RL algorithms.
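To make the idea of "prompting with masks" concrete, the following is a minimal sketch of how training-time and inference-time masks over a trajectory might be constructed. It is not the authors' implementation: the function names, the per-modality boolean layout (`True` means the token is visible), and the zero-fill in `apply_mask` are all assumptions made for illustration, and a real model would more likely substitute a learned mask token for hidden entries.

```python
import numpy as np


def random_mask(horizon, keep_ratio, rng):
    """Training-time mask (hypothetical helper): each state/action token
    is independently visible with probability keep_ratio."""
    return {
        "states": rng.random(horizon) < keep_ratio,
        "actions": rng.random(horizon) < keep_ratio,
    }


def forward_dynamics_mask(horizon, t):
    """Reveal s_0..s_t and a_0..a_t; everything later is hidden, so a
    reconstruction model would be asked to fill in s_{t+1}."""
    states = np.zeros(horizon, dtype=bool)
    actions = np.zeros(horizon, dtype=bool)
    states[: t + 1] = True
    actions[: t + 1] = True
    return {"states": states, "actions": actions}


def inverse_dynamics_mask(horizon, t):
    """Reveal only s_t and s_{t+1}; all actions are hidden, so a
    reconstruction model would be asked to fill in a_t."""
    states = np.zeros(horizon, dtype=bool)
    actions = np.zeros(horizon, dtype=bool)
    states[t : t + 2] = True
    return {"states": states, "actions": actions}


def apply_mask(tokens, visible, fill_value=0.0):
    """Return a copy of tokens (shape [horizon, dim]) with hidden rows
    overwritten by fill_value. Zero-fill is an assumption of this sketch."""
    out = tokens.copy()
    out[~visible] = fill_value
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    horizon, state_dim = 5, 3
    states = rng.normal(size=(horizon, state_dim))

    fwd = forward_dynamics_mask(horizon, t=2)
    print("forward dynamics, visible states:", fwd["states"])
    print("inverse dynamics, visible states:", inverse_dynamics_mask(horizon, t=1)["states"])
    print("masked state input:\n", apply_mask(states, fwd["states"]))
```

The point of the sketch is that the network and its weights stay fixed; only the visibility pattern changes between the training-time `random_mask` and the task-specific masks used at inference.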
Community Implementations: [2 code implementations (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2305.02968/code)