Hypernetwork-PPO for Continual Reinforcement Learning

08 Oct 2022, 17:47 (modified: 09 Dec 2022, 14:31)
Venue: Deep RL Workshop 2022
Readers: Everyone
Keywords: Continual Reinforcement Learning, Model-free Reinforcement Learning, Hypernetworks, Proximal Policy Optimization, Robotics, Multi-task Reinforcement Learning
TL;DR: Hypernetwork-PPO is a novel continual reinforcement learning method with strong protection against catastrophic forgetting and single-task performance comparable to PPO.
Abstract: Continually learning new capabilities in different environments and being able to solve multiple complex tasks are of great importance for many robotics applications. Modern reinforcement learning algorithms such as Proximal Policy Optimization can successfully handle surprisingly difficult tasks, but are generally not suited for multi-task or continual learning. Hypernetworks are a promising approach for avoiding catastrophic forgetting, and have previously been used successfully for continual model-learning in model-based RL. We propose HN-PPO, a continual model-free RL method employing a hypernetwork to learn multiple policies in a continual manner using PPO. We demonstrate our method on DoorGym, and show that it is suitable for solving tasks involving complex dynamics such as door opening, while effectively protecting against catastrophic forgetting.
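The abstract does not give architectural details, so the following is only a minimal sketch of the general idea of a hypernetwork that produces per-task policy weights. Every name and size here (`generate_policy_params`, `policy_mean`, `forgetting_penalty`, the single linear hypernetwork layer, the layer dimensions) is a hypothetical choice for illustration, not the authors' implementation. The penalty on changes to previously generated weights is one common way to limit forgetting in hypernetworks and is an assumption about how such protection could be achieved; the PPO update itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, HIDDEN, ACT_DIM, EMB_DIM = 4, 8, 2, 3
N_POLICY_PARAMS = OBS_DIM * HIDDEN + HIDDEN + HIDDEN * ACT_DIM + ACT_DIM

# Hypernetwork weights shared across tasks (assumed: one linear layer).
H = rng.normal(scale=0.1, size=(EMB_DIM, N_POLICY_PARAMS))
# One embedding vector per task (assumed learnable in a real system).
task_embeddings = rng.normal(size=(2, EMB_DIM))


def generate_policy_params(embedding, hyper_weights=H):
    """Map a task embedding to a flat vector of policy-network weights."""
    return embedding @ hyper_weights


def policy_mean(obs, flat_params):
    """Evaluate a one-hidden-layer policy whose weights come from the hypernetwork."""
    i = 0
    w1 = flat_params[i:i + OBS_DIM * HIDDEN].reshape(OBS_DIM, HIDDEN)
    i += OBS_DIM * HIDDEN
    b1 = flat_params[i:i + HIDDEN]
    i += HIDDEN
    w2 = flat_params[i:i + HIDDEN * ACT_DIM].reshape(HIDDEN, ACT_DIM)
    i += HIDDEN * ACT_DIM
    b2 = flat_params[i:i + ACT_DIM]
    return np.tanh(obs @ w1 + b1) @ w2 + b2


def forgetting_penalty(hyper_weights, old_embedding, old_params):
    """Squared change in the weights generated for a previously learned task."""
    return float(np.sum((old_embedding @ hyper_weights - old_params) ** 2))


# Usage: each task embedding yields its own policy from the same hypernetwork.
obs = np.ones(OBS_DIM)
action_task0 = policy_mean(obs, generate_policy_params(task_embeddings[0]))
action_task1 = policy_mean(obs, generate_policy_params(task_embeddings[1]))
```

In a setup like this, the policy for an earlier task would be recovered by feeding its stored embedding back through the hypernetwork, and a term such as `forgetting_penalty` would be added to the training loss for the current task to discourage the shared weights from drifting.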