Evolving Reinforcement Learning Algorithms

John D Co-Reyes; Yingjie Miao; Daiyi Peng; Esteban Real; Quoc V Le; Sergey Levine; Honglak Lee; Aleksandra Faust

Published: 12 Jan 2021, Last Modified: 22 Jun 2025, ICLR 2021 Oral, Readers: Everyone

Keywords: reinforcement learning, evolutionary algorithms, meta-learning, genetic programming

Abstract: We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs that compute the loss function for a value-based, model-free RL agent to optimize. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. Our method can both learn from scratch and bootstrap off existing algorithms such as DQN, enabling interpretable modifications that improve performance. Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm. When bootstrapped from DQN, it produces learned algorithms, two of which we highlight, that generalize well to other classical control tasks, gridworld-type tasks, and Atari games. Analysis of the learned algorithms' behavior shows a resemblance to recently proposed RL algorithms that address overestimation in value-based methods.
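To make the idea of "a computational graph that computes the loss" concrete, here is a minimal Python sketch, not the authors' implementation. It encodes the standard squared one-step TD error as a small directed acyclic graph and evaluates it. The operator set `OPS`, the node names, and the helpers `evaluate` and `td_loss` are all hypothetical choices made for illustration; the search space and representation used in the paper may differ.

```python
import operator

# Hypothetical operator vocabulary for illustration only.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "square": lambda x: x * x,
}


def evaluate(graph, inputs, output):
    """Evaluate node `output` of a DAG given as {node: (op_name, [parents])}."""
    cache = dict(inputs)  # input nodes already have values

    def value(name):
        if name not in cache:
            op_name, parents = graph[name]
            cache[name] = OPS[op_name](*(value(p) for p in parents))
        return cache[name]

    return value(output)


# Squared one-step TD error:
#   loss = (Q(s, a) - (r + gamma * max_a' Q_target(s', a')))^2
TD_LOSS_GRAPH = {
    "discounted": ("mul", ["gamma", "max_q_next"]),
    "target": ("add", ["reward", "discounted"]),
    "delta": ("sub", ["q_sa", "target"]),
    "loss": ("square", ["delta"]),
}


def td_loss(q_sa, reward, gamma, max_q_next):
    """Evaluate the TD loss graph for a single transition."""
    inputs = {
        "q_sa": q_sa,
        "reward": reward,
        "gamma": gamma,
        "max_q_next": max_q_next,
    }
    return evaluate(TD_LOSS_GRAPH, inputs, "loss")
```

Under this representation, a search procedure would alter entries of a graph like `TD_LOSS_GRAPH` (for example, swapping an operator or rewiring a parent) and score each candidate by how well an agent trained on the resulting loss performs. That search loop is omitted here.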

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We meta-learn RL algorithms by evolving computational graphs which compute the loss function for a value-based model-free RL agent to optimize.

Supplementary Material: zip

Code: [google/brain_autorl](https://github.com/google/brain_autorl/tree/main/evolving_rl) + [4 community implementations](https://paperswithcode.com/paper/?openreview=0XXpJ4OtjW)

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/evolving-reinforcement-learning-algorithms/code)
