- Keywords: Hierarchical reinforcement learning, evolution strategies, reinforcement learning
- Abstract: This paper investigates the performance of Scalable Evolution Strategies (S-ES) as a Hierarchical Reinforcement Learning (HRL) approach. S-ES, named for its excellent scalability across many processors, was popularised by OpenAI when they showed its performance to be comparable to that of state-of-the-art policy gradient methods. However, to date, S-ES has not been tested in conjunction with HRL methods, which enable temporal abstraction and thus allow agents to tackle more challenging problems. In this work, we introduce a novel method that merges S-ES and HRL, allowing S-ES to be applied to difficult problems such as simultaneous robot locomotion and navigation. We show that S-ES requires no methodological or hyperparameter modifications to be used in a hierarchical context, and that its indifference to delayed rewards gives it competitive performance with state-of-the-art gradient-based HRL methods. The result is an HRL method that achieves state-of-the-art performance while remaining comparatively simple and highly scalable.
- One-sentence Summary: We introduce a novel method that merges evolution strategies and hierarchical reinforcement learning, and show that evolution strategies can achieve state-of-the-art performance on challenging problems.
- Supplementary Material: zip