Keywords: Reinforcement learning, Curriculum learning, Risk-aware decision-making, Heavy-tailed distributions
TL;DR: We propose a risk-aware curriculum generation algorithm that, given a heavy-tailed distribution over target tasks, generates two curricula: one to maximize the expected discounted return, and another to identify and over-sample rare and risky tasks.
Abstract: Automated curriculum generation for reinforcement learning (RL) aims to speed up learning by designing a sequence of tasks of increasing difficulty. Such tasks are usually drawn from probability distributions with exponentially bounded tails, such as uniform or Gaussian distributions. However, existing approaches overlook heavy-tailed distributions. Under such distributions, current methods may fail to learn optimal policies in rare and risky tasks, which fall under the tails and yield the lowest returns, respectively. We address this challenge by proposing a risk-aware curriculum generation algorithm that simultaneously creates two curricula: 1) a primary curriculum that aims to maximize the expected discounted return with respect to a distribution over target tasks, and 2) an auxiliary curriculum that identifies and over-samples rare and risky tasks observed in the primary curriculum. Our empirical results show that the proposed algorithm achieves significantly higher returns in both frequent and rare tasks than state-of-the-art methods.
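The two-curricula idea can be illustrated with a minimal sketch. This is not the proposed algorithm: it assumes scalar task parameters drawn from a Pareto distribution, treats "rare" as lying beyond an empirical quantile and "risky" as having the lowest observed returns, and mixes the two sources with a fixed probability. All function names and parameters (`tail_quantile`, `risk_fraction`, `aux_prob`) are hypothetical.

```python
import random


def sample_heavy_tailed_task(rng, alpha=1.5):
    """Draw a scalar task parameter from a Pareto distribution (assumed target distribution)."""
    return rng.paretovariate(alpha)


def select_rare_and_risky(observed, tail_quantile=0.9, risk_fraction=0.1):
    """Pick tasks for a hypothetical auxiliary pool.

    observed: list of (task_parameter, episode_return) pairs seen in the primary curriculum.
    Rare tasks lie beyond the empirical tail quantile of the task parameter;
    risky tasks are those with the lowest observed returns.
    """
    tasks = sorted(task for task, _ in observed)
    cutoff = tasks[int(tail_quantile * (len(tasks) - 1))]
    rare = [task for task, _ in observed if task > cutoff]

    n_risky = max(1, int(risk_fraction * len(observed)))
    by_return = sorted(observed, key=lambda pair: pair[1])
    risky = [task for task, _ in by_return[:n_risky]]

    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(rare + risky))


def sample_training_task(rng, primary_sampler, auxiliary_pool, aux_prob=0.3):
    """With probability aux_prob, over-sample from the auxiliary pool; otherwise use the primary sampler."""
    if auxiliary_pool and rng.random() < aux_prob:
        return rng.choice(auxiliary_pool)
    return primary_sampler(rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy assumption: the return decreases as the task parameter grows.
    observed = [(t, -t) for t in (sample_heavy_tailed_task(rng) for _ in range(200))]
    pool = select_rare_and_risky(observed)
    batch = [sample_training_task(rng, sample_heavy_tailed_task, pool) for _ in range(10)]
    print(len(pool), batch)
```

In this sketch the fixed `aux_prob` stands in for whatever mechanism balances the primary and auxiliary curricula; the abstract does not specify that mechanism.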
Supplementary Material: pdf
Other Supplementary Material: zip