Keywords: Reinforcement Learning, Empirical Science
Abstract: This paper introduces a new benchmark, the Cross-environment Hyperparameter
Setting Benchmark (CHS), that allows comparison of RL algorithms across environments
using only a single hyperparameter setting, encouraging algorithmic development
that is insensitive to hyperparameters. We demonstrate that the benchmark is
robust to statistical noise and obtains qualitatively similar results across repeated
applications, even when using a small number of samples. This robustness makes
the benchmark computationally cheap to apply, allowing statistically sound insights
at low cost. We provide two example instantiations of the CHS: one on a set of six
small control environments (SC-CHS) and one on the entire DM Control suite of 28
environments (DMC-CHS). Finally, to demonstrate the applicability of the CHS to
modern RL algorithms on challenging environments, we provide a novel empirical
study of an open question in the continuous control literature. We show, with
high confidence, that there is no meaningful difference in performance between
Ornstein-Uhlenbeck noise and uncorrelated Gaussian noise for exploration with
the DDPG algorithm on the DMC-CHS.
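For readers unfamiliar with the two exploration schemes compared above, the sketch below illustrates the difference: Ornstein-Uhlenbeck noise is temporally correlated across steps, while Gaussian noise is drawn independently each step. This is a minimal illustrative sketch, not the paper's implementation; the class names and the parameter defaults (theta=0.15, sigma=0.2, matching common DDPG conventions) are assumptions.

```python
import numpy as np

class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise:
    x_{t+1} = x_t + theta*(mu - x_t)*dt + sigma*sqrt(dt)*N(0, I)."""
    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=0):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.rng = np.random.default_rng(seed)
        self.x = np.full(dim, mu, dtype=np.float64)  # state carries correlation

    def sample(self):
        # Mean-reverting drift toward mu plus scaled Gaussian increment.
        self.x += (self.theta * (self.mu - self.x) * self.dt
                   + self.sigma * np.sqrt(self.dt)
                   * self.rng.standard_normal(self.x.shape))
        return self.x.copy()

class GaussianNoise:
    """Uncorrelated noise: each sample is an independent draw from N(0, sigma^2 I)."""
    def __init__(self, dim, sigma=0.2, seed=0):
        self.sigma, self.dim = sigma, dim
        self.rng = np.random.default_rng(seed)

    def sample(self):
        return self.sigma * self.rng.standard_normal(self.dim)
```

In either case the noise is simply added to the deterministic policy's output at acting time, e.g. `action = np.clip(policy(obs) + noise.sample(), -1.0, 1.0)` for actions bounded in [-1, 1].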
Submission Number: 330