ThEBES: Thorough Energy-Based Evolution Strategy

21 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission
Keywords: evolution strategies, reinforcement learning, energy-based models
TL;DR: A novel evolution strategy to evolve agents robust to observational noise
Abstract: Evolution Strategies (ESs), a family of evolutionary algorithms that iteratively update the parameters of a search distribution from which candidate solutions are sampled and evaluated, have recently achieved state-of-the-art results. Because they optimize a population rather than a single solution, ESs promise to evolve robust solutions, yet current methods have not delivered on this promise. We add an explicit drive towards robustness by applying noise to the search distribution mean after evaluating the solutions, which adds a stochastic drift to the ES search trajectory. We ground the algorithm mathematically in Energy-Based Models (EBMs) and interpret it as performing Langevin dynamics on the search space, so that it converges to a probability distribution over the search distribution parameters rather than to a point estimate. We call the resulting algorithm ThEBES, the Thorough Energy-Based Evolution Strategy. We compare ThEBES against state-of-the-art ESs on continuous policy search tasks. Our results show that ThEBES is competitive in terms of effectiveness and that, by virtue of its stochastic dynamics, it evolves policies that are more robust to observational noise. We believe this work is a promising avenue for future research and that it strengthens the theoretical backing of ESs by grounding them in energy-based models.
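The core idea in the abstract, perturbing the search distribution mean with noise after the population has been evaluated, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the update rule, so the code below assumes a plain isotropic Gaussian ES with a fixed `sigma`, standardized fitness scores, and a textbook Langevin-style step in which the energy is the negative fitness. The function name `noisy_mean_es` and the parameters `step` and `temperature` are hypothetical and chosen only for illustration.

```python
import numpy as np

def noisy_mean_es(fitness, mean0, sigma=0.1, step=0.01, temperature=1e-3,
                  pop_size=32, iterations=200, seed=0):
    """Hypothetical sketch: Gaussian ES whose mean receives Langevin-style noise.

    Assumptions (not taken from the paper): fixed isotropic sigma,
    standardized fitness, energy defined as -fitness.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean0, dtype=float).copy()
    trajectory = [mean.copy()]
    for _ in range(iterations):
        # Sample a population of perturbations around the current mean.
        eps = rng.standard_normal((pop_size, mean.size))
        scores = np.array([fitness(mean + sigma * e) for e in eps])
        # Standardize scores so the update is insensitive to fitness scale.
        adv = (scores - scores.mean()) / (scores.std() + 1e-8)
        # Monte Carlo estimate of the fitness gradient w.r.t. the mean.
        grad = (adv[:, None] * eps).mean(axis=0) / sigma
        # Drift: ascend the fitness, i.e. descend the energy (-fitness).
        mean = mean + step * grad
        # Diffusion: noise applied to the mean after the solutions were evaluated.
        mean = mean + np.sqrt(2.0 * step * temperature) * rng.standard_normal(mean.size)
        trajectory.append(mean.copy())
    return np.array(trajectory)

# Toy usage: maximize a concave quadratic whose optimum is at the origin.
trajectory = noisy_mean_es(lambda x: -np.sum(x ** 2), mean0=[2.0, 2.0, 2.0])
```

Because of the diffusion term, the mean in this sketch keeps fluctuating around high-fitness regions instead of settling on a single point, which is the qualitative behavior the abstract describes as converging to a distribution rather than a point estimate. Setting `temperature=0` removes the injected noise and leaves an ordinary ES update.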
Primary Area: reinforcement learning
Submission Number: 4135