The Effective Horizon Challenge

Cassidy Laidlaw; Daniel Khalil; Michelle Li; Laker Newhouse; Stuart Russell; Anca Dragan

Published: 12 Jun 2025, Last Modified: 10 Jul 2025. Venue: EXAIT@ICML 2025 Poster. License: CC BY 4.0

Track: Theory

Keywords: RL, benchmark, effective horizon, exploration, deep RL

Abstract: While benchmarks have driven significant progress in deep reinforcement learning (RL), they may be easier to solve than intended: recent work has found that many RL benchmark environments have a short *effective horizon*, a measure of complexity that captures how easy it is to explore and solve an environment via Monte Carlo lookahead search. We introduce a new benchmark, the Effective Horizon Challenge (EHC), which consists of environments with much longer effective horizons than those in past benchmarks. Although environments in the EHC have small state spaces, short episodes, shaped rewards, and deterministic transitions, we find that deep RL struggles to solve them. For example, PPO finds an optimal policy in only 8 of 43 environments in the EHC and DQN in only 12. Our results establish environments with long effective horizons as a new frontier for deep RL research, and the Effective Horizon Challenge provides a concrete way to make progress in this direction.
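To make the phrase "Monte Carlo lookahead search" concrete, here is a minimal sketch of the general idea: at each step, try every action, follow a uniformly random policy to the end of the episode, and act greedily on the averaged returns. This is an illustration only, not the authors' algorithm or their formal definition of the effective horizon. The function names, the `toy_step` environment, and the rollout count are all hypothetical choices made for this example.

```python
import random


def rollout_return(step, state, t, horizon, n_actions, rng):
    """Return collected by a uniformly random policy from (state, t) to episode end."""
    total = 0.0
    while t < horizon:
        state, reward = step(state, rng.randrange(n_actions))
        total += reward
        t += 1
    return total


def lookahead_action(step, state, t, horizon, n_actions, n_rollouts, rng):
    """Pick the action with the highest Monte Carlo estimate of its random-policy value."""
    estimates = []
    for action in range(n_actions):
        next_state, reward = step(state, action)
        # Average the return of n_rollouts random continuations after this action.
        tail = sum(
            rollout_return(step, next_state, t + 1, horizon, n_actions, rng)
            for _ in range(n_rollouts)
        ) / n_rollouts
        estimates.append(reward + tail)
    return max(range(n_actions), key=lambda a: estimates[a])


def greedy_lookahead_plan(step, start, horizon, n_actions, n_rollouts, seed=0):
    """Build an action sequence by repeating one-step lookahead at every timestep."""
    rng = random.Random(seed)
    state, plan = start, []
    for t in range(horizon):
        action = lookahead_action(step, state, t, horizon, n_actions, n_rollouts, rng)
        plan.append(action)
        state, _ = step(state, action)
    return plan


def toy_step(state, action):
    """Hypothetical deterministic task: action 1 earns reward 1, action 0 earns nothing."""
    return state + action, float(action)


plan = greedy_lookahead_plan(toy_step, start=0, horizon=5, n_actions=2, n_rollouts=200)
print(plan)
```

In this toy task the immediate reward already separates good actions from bad ones, so shallow lookahead with random rollouts is enough. Going by the abstract's description, environments with long effective horizons are ones where this kind of cheap search would not suffice.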

Submission Number: 68
