Abstract: Reinforcement learning (RL) methods are known to be highly sensitive to their hyperparameter settings and costly to evaluate. In light of this, surrogate models that predict the performance of an algorithm given a hyperparameter configuration are an attractive option for understanding and optimising these computationally expensive tasks. In this work, we study such surrogates for RL and find that RL methods pose a significant challenge to current performance prediction approaches. Specifically, RL landscapes appear to be rugged and noisy, which leads to poor surrogate model quality. Even when surrogate models are used only for gaining insights into hyperparameter landscapes, rather than as replacements for algorithm evaluations in benchmarking, we find that they deviate substantially from the ground truth. While our evaluation highlights the limits of surrogate modelling for RL, we also propose a method for automatically reducing configuration spaces to improve surrogate performance. Finally, we derive recommendations for RL practitioners that caution against blindly trusting surrogate-based methods in this domain and highlight where and how they can be used.
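To illustrate the core idea the abstract describes, below is a minimal, hypothetical sketch of a performance-prediction surrogate. It is not the paper's method: it uses an invented one-dimensional "rugged and noisy" performance landscape over a single hyperparameter and a simple k-nearest-neighbour regressor as the surrogate, then measures how far the surrogate's predictions deviate from the noise-free ground truth.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rugged, noisy "RL performance" landscape over one
# hyperparameter in [0, 1]: a smooth trend plus high-frequency
# ruggedness, with per-evaluation noise (seed-to-seed variation).
def evaluate(x, noise=0.3):
    signal = -(x - 0.5) ** 2 + 0.1 * np.sin(40 * x)
    return signal + rng.normal(0.0, noise, size=np.shape(x))

# A small budget of evaluated configurations, as in expensive RL runs.
X_train = rng.uniform(0.0, 1.0, size=64)
y_train = evaluate(X_train)

# Minimal surrogate: predict the mean observed return of the k
# evaluated configurations closest to the query configuration.
def knn_surrogate(x_query, X, y, k=5):
    idx = np.argsort(np.abs(X - x_query))[:k]
    return y[idx].mean()

# Compare surrogate predictions against the noise-free landscape.
X_test = rng.uniform(0.0, 1.0, size=200)
y_true = -(X_test - 0.5) ** 2 + 0.1 * np.sin(40 * X_test)
y_pred = np.array([knn_surrogate(x, X_train, y_train) for x in X_test])

rmse = float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
print(f"surrogate RMSE vs. noise-free landscape: {rmse:.3f}")
```

Because the training signal is noisy, the surrogate's error stays well above zero even with a reasonable evaluation budget, which is a toy analogue of the abstract's observation that ruggedness and noise degrade surrogate quality.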
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Adam_M_White1
Submission Number: 7665