Keywords: Evolution strategies, Transformers, Policy optimization, Reinforcement learning
TL;DR: Exploring whether evolution strategies can train transformer-based agents in reinforcement learning.
Abstract: We explore whether evolution strategies can train an agent whose policy is based on a transformer architecture in a reinforcement learning setting. We used OpenAI’s highly parallelizable evolution strategy to train a Decision Transformer in the MuJoCo Humanoid locomotion environment and in Atari game environments, testing whether this black-box optimization technique can train models that are large and complex relative to those previously tested in the literature. The examined evolution strategy was, in general, capable of achieving strong results and produced high-performing agents, showing that evolution strategies can handle the training of such complex models.
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 20356
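Illustrative sketch (not taken from the submission): the abstract describes training a policy with an evolution strategy treated as a black-box optimizer. The Python below is a minimal, simplified version of that idea, assuming a flat parameter vector, Gaussian perturbations, and standardized returns as weights. The names `es_step` and `toy_return` and all hyperparameter values are hypothetical, and the toy quadratic objective merely stands in for an episode return from an environment such as Humanoid or Atari. The method used in the paper may differ in details such as fitness shaping, sampling scheme, and optimizer.

```python
import random


def es_step(theta, fitness, rng, pop_size=50, sigma=0.1, lr=0.02):
    """One update of a simplified evolution strategy (illustrative only)."""
    dim = len(theta)
    # Sample one Gaussian perturbation of the parameters per population member.
    noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    # Evaluate every perturbed parameter vector. These evaluations are
    # independent of each other, which is what makes the method easy to
    # parallelize across workers.
    returns = [
        fitness([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises
    ]
    # Standardize returns so the update is insensitive to the reward scale
    # (an assumption of this sketch, not necessarily the paper's choice).
    mean = sum(returns) / pop_size
    std = (sum((r - mean) ** 2 for r in returns) / pop_size) ** 0.5 or 1.0
    weights = [(r - mean) / std for r in returns]
    # Estimate the search gradient as the return-weighted average of the noise.
    grad = [
        sum(w * eps[j] for w, eps in zip(weights, noises)) / (pop_size * sigma)
        for j in range(dim)
    ]
    # Plain gradient ascent step on the parameters.
    return [t + lr * g for t, g in zip(theta, grad)]


def toy_return(params):
    """Hypothetical stand-in for an episode return; maximized at all ones."""
    return -sum((p - 1.0) ** 2 for p in params)


rng = random.Random(0)
theta = [0.0] * 5
initial = toy_return(theta)
for _ in range(200):
    theta = es_step(theta, toy_return, rng)
final = toy_return(theta)
```

No gradients of the policy are computed here: the update uses only the scalar returns, which is why the same loop could in principle be applied to a transformer's parameters once they are flattened into `theta`.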