Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning

Yutong Wang; Ke Xue; Chao Qian

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Poster, Readers: Everyone

Keywords: Reinforcement learning, Quality-Diversity, Evolutionary algorithms

Abstract: Reinforcement Learning (RL), which aims to obtain a single policy maximizing the expected cumulative reward for a given task, has achieved significant successes. However, in many real-world scenarios, e.g., navigating complex environments and controlling robots, one may need a set of policies with both high rewards and diverse behaviors, which can bring better exploration and robust few-shot adaptation. Recently, some methods have been developed using evolutionary techniques, including iterative reproduction and selection of policies. However, due to inefficient selection mechanisms, these methods cannot fully guarantee both high quality and diversity. In this paper, we propose EDO-CS, a new Evolutionary Diversity Optimization algorithm with Clustering-based Selection. In each iteration, the policies are divided into several clusters based on their behaviors, and a high-quality policy is selected from each cluster for reproduction. EDO-CS also adaptively balances the importance of quality and diversity in the reproduction process. Experiments on various (i.e., deceptive and multi-modal) continuous control tasks show the superior performance of EDO-CS over previous methods: EDO-CS efficiently achieves a set of policies with both high quality and diversity, while previous methods cannot.
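
The clustering-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the selection rule within a cluster, so this sketch assumes k-means with a deterministic farthest-first initialization and picks the highest-reward policy in each cluster. The names `cluster_behaviors` and `select_parents`, the 2-D behavior descriptors, and the reward values are all hypothetical.

```python
import numpy as np


def cluster_behaviors(behaviors, k, iters=20):
    """Assumed clustering step: plain k-means with farthest-first initialization."""
    behaviors = np.asarray(behaviors, dtype=float)
    # Farthest-first init: start from the first point, then repeatedly add
    # the point that is farthest from all centers chosen so far.
    centers = [behaviors[0]]
    for _ in range(1, k):
        chosen = np.array(centers)
        dists = np.linalg.norm(behaviors[:, None, :] - chosen[None, :, :], axis=2)
        centers.append(behaviors[int(np.argmax(dists.min(axis=1)))])
    centers = np.array(centers, dtype=float)

    labels = np.zeros(len(behaviors), dtype=int)
    for _ in range(iters):
        # Assign every policy to its nearest center in behavior space.
        dists = np.linalg.norm(behaviors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (skip empty clusters).
        for j in range(k):
            members = behaviors[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels


def select_parents(behaviors, rewards, k):
    """Assumed selection rule: keep the highest-reward policy of each cluster."""
    rewards = np.asarray(rewards, dtype=float)
    labels = cluster_behaviors(behaviors, k)
    parents = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue  # an empty cluster contributes no parent
        parents.append(int(idx[np.argmax(rewards[idx])]))
    return parents


# Hypothetical population: six policies with 2-D behavior descriptors.
behaviors = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                      [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
rewards = np.array([1.0, 3.0, 2.0, 0.5, 0.7, 0.9])

parents = select_parents(behaviors, rewards, k=2)
print(parents)  # [1, 5]
```

In this toy population, selecting the two policies with the highest rewards overall would return indices 1 and 2, which behave almost identically. Selecting one policy per behavior cluster keeps index 5 instead, so the parents used for reproduction cover both behavior regions.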

One-sentence Summary: We propose EDO-CS, a new Evolutionary Diversity Optimization algorithm with Clustering-based Selection that can achieve a set of policies with both high quality and diversity efficiently.
