Higher Order and Self-Referential Evolution for Population-based Methods

Published: 17 Jun 2024, Last Modified: 01 Jul 2024 · AutoRL@ICML 2024 · CC BY 4.0
Keywords: Genetic algorithms, evolutionary algorithms, population-based training, unsupervised environment design, reinforcement learning
TL;DR: We analyze higher-order self-referential evolution, apply it to population-based training and unsupervised environment design, and observe improved robustness to initial hyperparameters, progressing towards self-tuning, hyperparameter-free systems.
Abstract: Due to their simplicity and support for high levels of parallelism, evolutionary algorithms have regained popularity in machine learning applications such as curriculum generation for Reinforcement Learning and online hyperparameter tuning. Yet their performance can be brittle with respect to evolutionary hyperparameters, e.g. the mutation rate. To address this, *self-adaptive* mutation rates, i.e. mutation rates that also evolve, have previously been proposed. While this approach offers a partial solution, it still relies on meta-mutation rates set a priori. Inspired by recent work demonstrating specific cases where evolution implicitly optimizes for *higher-order meta-mutation* rates, we investigate whether these higher-order mutations can make evolutionary algorithms more robust and improve their overall performance. We also analyze *self-referential* mutations, which mutate the final-order meta-mutation parameter. Our results show that self-referential mutations improve robustness to initial hyperparameters in Population-based Training (PBT) for online hyperparameter tuning and in curriculum learning using Unsupervised Environment Design (UED). We also observe that *self-referential* mutations result in more complex adaptation in competitive multi-agent settings. Our research presents first steps towards robust, fully self-tuning systems that are hyperparameter-free.
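To make the hierarchy concrete, the following is a minimal sketch (not the paper's exact scheme) of a genome that carries its own mutation rates: each rate perturbs the level below it, and the final-order rate additionally perturbs itself, which is the *self-referential* step that removes the need for a fixed a priori meta-mutation rate. The Gaussian perturbations and log-normal rate updates are illustrative assumptions.

```python
import math
import random

def self_referential_mutate(genome):
    """Mutate a genome of the form [x, sigma_1, ..., sigma_k].

    x        -- the solution parameter (e.g. a hyperparameter in PBT)
    sigma_i  -- the mutation rate governing the entry one level below
    sigma_k  -- the final-order rate, which also mutates ITSELF
                (the self-referential step; no fixed meta-rate needed)
    """
    g = list(genome)
    k = len(g) - 1  # index of the final-order rate
    # Self-referential step: the top rate perturbs itself.
    # A log-normal update keeps the rate strictly positive.
    g[k] *= math.exp(random.gauss(0.0, g[k]))
    # Cascade downwards: each rate perturbs the level below it.
    for i in range(k, 0, -1):
        if i == 1:
            g[0] += random.gauss(0.0, g[1])               # mutate the solution
        else:
            g[i - 1] *= math.exp(random.gauss(0.0, g[i]))  # mutate a lower rate
    return g

if __name__ == "__main__":
    parent = [0.0, 0.1, 0.05]  # [x, sigma_1, sigma_2]
    child = self_referential_mutate(parent)
    print(child)
```

With a single rate (`k = 1`) this reduces to classic self-adaptation; adding levels yields the higher-order variants the paper studies, and making the last rate mutate itself closes the hierarchy so that no hand-set hyperparameter remains.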
Supplementary Material: zip
Submission Number: 5