Keywords: Reinforcement Learning, Representation Learning, Planning
TL;DR: A framework that leverages diverse offline data for learning representations, goal-conditioned policies, and affordance models that enable rapid fine-tuning to new tasks in target scenes.
Abstract: The use of broad datasets has proven to be crucial for generalization for a wide range of fields. However, how to effectively make use of diverse multi-task data for novel downstream tasks still remains a grand challenge in reinforcement learning and robotics. To tackle this challenge, we introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data, in combination with online fine-tuning guided by subgoals in a learned lossy representation space. When faced with a novel task goal, our framework uses an affordance model to plan a sequence of lossy representations as subgoals that decomposes the original task into easier problems. Learned from the broad prior data, the lossy representation emphasizes task-relevant information about states and goals while abstracting away redundant contexts that hinder generalization. It thus enables subgoal planning for unseen tasks, provides a compact input to the policy, and facilitates reward shaping during fine-tuning. We show that our framework can be pre-trained on large-scale datasets of robot experience from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
Student First Author: no
Supplementary Material: zip