Generalization with a SPARC: Single-Phase Adaptation for Reinforcement Learning in Contextual Environments
Keywords: reinforcement learning, generalization, autonomous racing
TL;DR: We introduce SPARC, a single-phase context-adaptive RL algorithm which delivers state-of-the-art OOD generalization on MuJoCo and Gran Turismo 7.
Abstract: Generalization to unseen environments is a significant challenge in robotics and control. In this work, we focus on contextual reinforcement learning, where the agent acts in environments whose context varies; examples include self-driving cars and quadrupedal robots that must operate on terrains or in weather conditions different from those they were trained on. We tackle the critical task of generalizing to out-of-distribution (OOD) contexts without access to explicit context information at test time. Recent work has addressed this problem by training a context encoder and a history adaptation module in separate stages. While promising, this two-phase approach is cumbersome to implement and train. We simplify the methodology and introduce SPARC, a single-phase adaptation method for reinforcement learning in contextual environments. We evaluate SPARC on varying contexts within MuJoCo environments and the high-fidelity racing simulator Gran Turismo 7, and find that it achieves competitive or superior performance on OOD generalization.
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Bram_Grooten1
Track: Regular Track: unpublished work
Submission Number: 46