Object-Centric Learning of Neural Policies for Zero-shot Transfer over Domains with Varying Quantities of Interest
Keywords: Object-centric Reinforcement Learning, Zero-shot transfer
TL;DR: We propose an object-centric reinforcement learning agent for zero-shot transfer across environments.
Abstract: Our goal is to learn policies that generalize, in a zero-shot manner, across variation in quantities of interest in the domain (e.g., number of objects, motion dynamics, distance to the goal). Recent work on object-centric approaches for image and video processing has shown significant promise in building models that generalize well to unseen settings. In this work, we present the Object-centric Reinforcement Learning Agent (ORLA), the first object-centric approach for model-free RL in perceptual domains. ORLA works in three phases. First, it learns to extract a variable number of object masks via an expert trained using an encoder-decoder architecture, which in turn generates data for fine-tuning a YOLO-based model that extracts bounding boxes in unseen settings. Second, the bounding boxes are used to construct a symbolic state consisting of object positions across a sequence of frames. Finally, a GAT-based architecture is employed over the extracted object positions to learn a dense state embedding, which is then decoded to obtain the final policy that generalizes to unseen environments. Extensive experimentation over a number of domains shows that ORLA learns significantly better policies that transfer across variations in different quantities of interest, compared to existing baselines, which often fail to achieve any meaningful transfer.
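The third phase can be illustrated with a minimal sketch. The code below is a hypothetical simplification, not ORLA's actual implementation: a single graph-attention aggregation over a fully connected graph of object positions, mean-pooled into a fixed-size state embedding and decoded to action logits. All names (`gat_layer`, `policy_logits`) and dimensions are illustrative assumptions; the key property shown is that the same parameters apply regardless of the number of objects, which is what makes zero-shot transfer over object count possible.

```python
import numpy as np

def gat_layer(h, W, a_src, a_dst):
    """One simplified graph-attention layer over a fully connected
    object graph (illustrative sketch; the paper's exact layer may differ).
    h: (n_objects, d_in) node features; W: (d_in, d_out) projection;
    a_src, a_dst: (d_out,) attention parameter vectors."""
    z = h @ W                                    # project nodes: (n, d_out)
    # attention logits e[i, j] = leaky_relu(a_src . z_i + a_dst . z_j)
    e = np.add.outer(z @ a_src, z @ a_dst)
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # row-wise softmax
    return np.tanh(alpha @ z)                    # aggregated features: (n, d_out)

def policy_logits(positions, params):
    """positions: (n_objects, d_in), e.g. (x, y) for two consecutive frames.
    Mean-pooling over nodes makes the output independent of object count."""
    W, a_src, a_dst, W_pi = params
    h = gat_layer(positions, W, a_src, a_dst)
    state = h.mean(axis=0)                       # dense state embedding: (d_out,)
    return state @ W_pi                          # action logits: (n_actions,)

rng = np.random.default_rng(0)
d_in, d_out, n_actions = 4, 8, 3
params = (rng.normal(size=(d_in, d_out)),
          rng.normal(size=d_out),
          rng.normal(size=d_out),
          rng.normal(size=(d_out, n_actions)))

# The same parameters handle 5 or 9 objects without any change,
# mirroring transfer across environments with varying object counts.
for n in (5, 9):
    print(policy_logits(rng.normal(size=(n, d_in)), params).shape)
```

In a trained agent the pooled embedding would feed a learned policy head; here a single linear decode stands in for it.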
Submission Number: 18