Parseval Regularization for Continual Reinforcement Learning

Published: 25 Sept 2024 · Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: Reinforcement Learning, Continual Learning, Plasticity, Optimization
TL;DR: Maintaining orthogonal weight matrices during training improves continual reinforcement learning agents.
Abstract: Plasticity loss, trainability loss, and primacy bias have been identified as issues arising when training deep neural networks on sequences of tasks, all referring to the increased difficulty of training on new tasks. We propose to use Parseval regularization, which maintains the orthogonality of weight matrices, to preserve useful optimization properties and improve training in a continual reinforcement learning setting. We show that it provides significant benefits to RL agents on a suite of gridworld, CARL, and MetaWorld tasks. We conduct comprehensive ablations to identify the source of its benefits and investigate the effect of metrics associated with network trainability, including weight matrix rank, weight norms, and policy entropy.
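For context, Parseval regularization (Cisse et al., 2017) adds a soft orthogonality penalty of the form R(W) = (β/2)·||W Wᵀ − I||_F² for each weight matrix W to the training loss. Below is a minimal PyTorch sketch of such a penalty; the coefficient beta, the flattening of convolutional kernels, and the parameter filter in the usage comment are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def parseval_penalty(weight: torch.Tensor, beta: float = 1e-3) -> torch.Tensor:
    """Soft orthogonality penalty (beta/2) * ||W W^T - I||_F^2, in the style of
    Parseval networks (Cisse et al., 2017). beta is an assumed coefficient."""
    W = weight.view(weight.shape[0], -1)  # flatten conv kernels to a 2-D matrix
    if W.shape[0] > W.shape[1]:           # take the Gram matrix over the smaller
        W = W.t()                         # dimension so orthogonality is attainable
    gram = W @ W.t()
    eye = torch.eye(gram.shape[0], device=W.device, dtype=W.dtype)
    return 0.5 * beta * (gram - eye).pow(2).sum()

# Hypothetical usage: add the penalty over all weight matrices to the RL loss.
# loss = rl_loss + sum(parseval_penalty(p) for name, p in model.named_parameters()
#                      if p.dim() >= 2)
```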
Primary Area: Reinforcement learning
Submission Number: 17753