Performance Prediction In Reinforcement Learning: The Bad And The Ugly

Julian Dierkes; Theresa Eimer; Marius Lindauer; Holger Hoos

Published: 17 Jul 2025, Last Modified: 07 Oct 2025

Venue: EWRL 2025 Poster

License: CC BY 4.0

Keywords: AutoRL, performance prediction, hyperparameters

TL;DR: RL performance as a function of hyperparameters is hard to predict, noisy and multi-modal. Surrogate benchmarks and interpretability seem out of reach with current methods.

Abstract: Reinforcement learning (RL) methods are known to be highly sensitive to their hyperparameter settings and costly to evaluate. In light of this, surrogate models that predict the performance of an algorithm for a given hyperparameter configuration seem an attractive solution for understanding and optimising computationally expensive tasks. In this work, we study such surrogates for RL and find that RL methods present a significant challenge to current performance prediction approaches. Specifically, RL landscapes appear to be rugged and noisy, which is reflected in the poor performance of surrogate models. Even if surrogate models are only used for gaining insights into the hyperparameter landscapes and not as replacements for algorithm evaluations in benchmarking, we find that they deviate from the ground truth significantly. Our evaluation highlights the limits of surrogate modelling for RL and cautions against blindly trusting surrogate-based methods for this domain. This calls for more sophisticated solutions for effectively using surrogate models in sequential model-based optimisation of RL hyperparameters.
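To make the notion of a performance-prediction surrogate concrete, here is a minimal sketch. It is an illustration only and not the method, models, or data used in the paper: `noisy_return` is a hypothetical stand-in for a single RL training run, the single hyperparameter (a log10 learning rate) and its range are assumptions, and the k-nearest-neighbour regressor is chosen purely because it needs no external libraries.

```python
import math
import random


def noisy_return(lr_exp, rng):
    """Hypothetical stand-in for one RL training run (not data from the paper).

    Maps a log10 learning rate to a bumpy, multi-modal score and adds
    Gaussian noise to mimic seed-to-seed variance.
    """
    signal = math.sin(3.0 * lr_exp) + 0.5 * math.cos(7.0 * lr_exp)
    return signal + rng.gauss(0.0, 0.5)


def make_data(n, rng):
    """Sample n configurations uniformly and evaluate each once."""
    xs = [rng.uniform(-5.0, -1.0) for _ in range(n)]
    return [(x, noisy_return(x, rng)) for x in xs]


def knn_predict(train, x, k=3):
    """Surrogate: average the scores of the k nearest evaluated configurations."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(train, test, k=3):
    """Root-mean-square error of the surrogate on held-out evaluations."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
    return math.sqrt(sum(errors) / len(errors))


rng = random.Random(0)
train = make_data(40, rng)  # configurations the surrogate is fitted on
test = make_data(20, rng)   # held-out configurations used as ground truth
print(f"held-out RMSE of the surrogate: {rmse(train, test):.3f}")
```

Comparing surrogate predictions against held-out evaluations in this way is one simple check of how far a surrogate deviates from the ground truth; with a rugged, noisy target the error stays large even when more configurations are sampled.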

Track: Regular Track: unpublished work

Submission Number: 85
