Atari-5: Distilling the Arcade Learning Environment down to Five Games
Abstract: The Arcade Learning Environment (ALE) has become an essential benchmark for assessing the performance of reinforcement learning algorithms. However, the computational cost of generating results on the entire 57-game dataset limits ALE's use and makes many results infeasible to reproduce. We propose a novel solution to this problem: a principled methodology for selecting small but representative subsets of environments within a benchmark suite. We apply our method to identify a subset of five ALE games, which we call *Atari-5*, that produces 57-game median score estimates within 10% of their true values. Extending the subset to 10 games recovers 80% of the variance in log-scores for *all* games within the 57-game set. We show that this level of compression is possible because of the high degree of correlation between many of the games in ALE.
Submission Number: 3520
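
The subset-selection idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes a linear least-squares fit from the log-scores of a candidate subset to the median log-score over all games, an exhaustive search over subsets, and in-sample error as the selection criterion. The data are synthetic, and the names `make_synthetic_scores`, `fit_subset`, and `best_subset` are hypothetical.

```python
import itertools
import numpy as np


def make_synthetic_scores(n_algos=40, n_games=8, seed=0):
    """Synthetic log-scores (NOT real ALE data): one row per algorithm, one column per game.

    A shared latent 'skill' factor makes the games correlated, which is the
    property the abstract credits for the compressibility of the benchmark.
    """
    rng = np.random.default_rng(seed)
    skill = rng.normal(size=(n_algos, 1))
    loadings = rng.uniform(0.5, 1.5, size=(1, n_games))
    noise = 0.3 * rng.normal(size=(n_algos, n_games))
    return skill * loadings + noise


def fit_subset(log_scores, target, subset):
    """Least-squares fit of `target` from the chosen game columns plus an intercept."""
    X = np.column_stack([log_scores[:, list(subset)], np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    mse = float(np.mean((X @ coef - target) ** 2))
    return coef, mse


def best_subset(log_scores, k):
    """Exhaustively search all k-game subsets; return (subset, in-sample MSE).

    Assumption: the target is the per-algorithm median of log-scores over all games.
    """
    target = np.median(log_scores, axis=1)
    best = None
    for subset in itertools.combinations(range(log_scores.shape[1]), k):
        _, mse = fit_subset(log_scores, target, subset)
        if best is None or mse < best[1]:
            best = (subset, mse)
    return best


if __name__ == "__main__":
    scores = make_synthetic_scores()
    for k in (1, 2, 3):
        subset, mse = best_subset(scores, k)
        print(f"k={k}: games {subset}, in-sample MSE {mse:.4f}")
```

Exhaustive search is affordable here only because the toy problem has eight games. In-sample error also favors larger subsets by construction, so a held-out or cross-validated criterion would be the more careful choice in practice.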