Keywords: Parameter-efficient fine-tuning, Modular Reinforcement Learning, Few-shot imitation learning
TL;DR: We present a scalable two-stage framework for robot learning: task-agnostic pretraining of a transformer policy followed by parameter-efficient few-shot adaptation to downstream tasks using adapters.
Abstract: Large transformer-based architectures are capable of complex robot task planning and low-level control. In the natural language processing (NLP) community, fine-tuning large pretrained models (PTMs) such as GPT-3 and PaLM is the de facto standard. Given the scalability of transformer models and the growing availability of large-scale multimodal robot data, we investigate pretraining large backbone models to capture useful behavioral priors that enable efficient few-shot transfer to downstream robot tasks. We explore the setting of modular reinforcement learning (RL), in which each downstream task is encapsulated by an independently learned module. With many downstream tasks, fine-tuning or training separate copies of these large PTMs becomes computationally and memory intensive. We propose to pretrain a large transformer backbone on task-agnostic data and to learn small task-specific adapters via few-shot imitation learning that quickly adapt the policy to downstream tasks. We evaluate our approach on complex robot manipulation tasks in the Meta-World environment and demonstrate that adapter training is a parameter-efficient approach to modular RL.
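The abstract does not specify the adapter architecture; below is a minimal PyTorch sketch, assuming bottleneck adapters (in the style of Houlsby et al.) inserted after each frozen transformer block. All module names, dimensions, and the adapter placement are illustrative, not the paper's confirmed implementation.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, ReLU, up-project, residual connection."""
    def __init__(self, d_model: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        nn.init.zeros_(self.up.weight)  # adapter starts as an identity map
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))

class AdaptedBlock(nn.Module):
    """A frozen pretrained transformer block followed by a trainable adapter."""
    def __init__(self, block: nn.Module, d_model: int):
        super().__init__()
        self.block = block
        self.adapter = Adapter(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.block(x))

# Stand-in for the task-agnostic pretrained backbone (4 encoder layers).
d_model = 128
backbone = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
    for _ in range(4)
)
for p in backbone.parameters():  # freeze all backbone weights
    p.requires_grad = False

# Per-task policy: frozen backbone plus small trainable adapters.
policy = nn.Sequential(*(AdaptedBlock(b, d_model) for b in backbone))

# Only adapter parameters are optimized with the few-shot imitation loss.
opt = torch.optim.Adam(
    (p for p in policy.parameters() if p.requires_grad), lr=1e-4
)
```

Under this scheme, each downstream task adds only the adapter weights (two small linear layers per block), which is what makes per-task modules cheap to store and train relative to fine-tuning a full copy of the backbone.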