Keywords: Continual Reinforcement Learning, Model-free Reinforcement Learning, Hypernetworks, Proximal Policy Optimization, Robotics, Multi-task Reinforcement Learning
TL;DR: Hypernetwork-PPO is a novel continual reinforcement learning method that offers strong protection against catastrophic forgetting and single-task performance comparable to PPO.
Abstract: Continually learning new capabilities in different environments and being able to solve multiple complex tasks are of great importance for many robotics applications. Modern reinforcement learning algorithms such as Proximal Policy Optimization (PPO) can successfully handle surprisingly difficult tasks, but are generally not suited for multi-task or continual learning. Hypernetworks are a promising approach for avoiding catastrophic forgetting, and have previously been used successfully for continual model-learning in model-based RL. We propose HN-PPO, a continual model-free RL method that employs a hypernetwork to learn multiple policies in a continual manner using PPO. We demonstrate our method on DoorGym, and show that it is suitable for solving tasks involving complex dynamics, such as door opening, while effectively protecting against catastrophic forgetting.