Assessing Generalization in Deep Reinforcement LearningDownload PDF

27 Sept 2018 (modified: 14 Oct 2024)ICLR 2019 Conference Blind SubmissionReaders: Everyone
Abstract: Deep reinforcement learning (RL) has achieved breakthrough results on many tasks, but has been shown to be sensitive to system changes at test time. As a result, building deep RL agents that generalize has become an active research area. Our aim is to catalyze and streamline community-wide progress on this problem by providing the first benchmark and a common experimental protocol for investigating generalization in RL. Our benchmark contains a diverse set of environments and our evaluation methodology covers both in-distribution and out-of-distribution generalization. To provide a set of baselines for future research, we conduct a systematic evaluation of state-of-the-art algorithms, including those that specifically tackle the problem of generalization. The experimental results indicate that in-distribution generalization may be within the capacity of current algorithms, while out-of-distribution generalization is an exciting challenge for future work.
Keywords: reinforcement learning, generalization, benchmark
TL;DR: We provide the first benchmark and common experimental protocol for investigating generalization in RL, and conduct a systematic evaluation of state-of-the-art deep RL algorithms.
Code: [![github](/images/github_icon.svg) sunblaze-ucb/rl-generalization](https://github.com/sunblaze-ucb/rl-generalization)
Data: [Arcade Learning Environment](https://paperswithcode.com/dataset/arcade-learning-environment), [MuJoCo](https://paperswithcode.com/dataset/mujoco), [OpenAI Gym](https://paperswithcode.com/dataset/openai-gym)
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 3 code implementations](https://www.catalyzex.com/paper/assessing-generalization-in-deep/code)
9 Replies

Loading