MDP Playground: A Design and Debug Testbed for Reinforcement Learning

07 Jun 2021 (modified: 24 May 2023) · Submitted to NeurIPS 2021 Datasets and Benchmarks Track (Round 1) · Readers: Everyone
Keywords: Reinforcement learning, Core issues, Efficiency, Reproducibility, Dimensions of hardness, OpenAI Gym, Benchmarks
TL;DR: Platform to design and debug Reinforcement Learning (RL) agents with controllable hardness dimensions
Abstract: We present MDP Playground, an efficient testbed for Reinforcement Learning (RL) agents with orthogonal dimensions that can be controlled independently to challenge agents in different ways and obtain varying degrees of hardness in generated environments. We consider and allow control over a wide variety of dimensions, including delayed rewards, rewardable sequences, density of rewards, stochasticity, image representations, irrelevant features, time unit, action range and more. We define a parameterised collection of fast-to-run toy environments in OpenAI Gym by varying these dimensions and propose to use these for the initial design and development of agents. We also provide wrappers that inject these dimensions into complex environments from Atari and MuJoCo to allow for evaluating agent robustness. We further provide various example use-cases and instructions on how to use MDP Playground to design and debug agents. We believe that MDP Playground is a valuable testbed for researchers designing new, adaptive and intelligent RL agents and those wanting to unit test their agents.
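To illustrate one of the hardness dimensions described in the abstract, the sketch below implements a reward-delay wrapper in plain Python. This is a hypothetical, dependency-free illustration of the idea, not MDP Playground's actual wrapper API (the names `DelayRewardWrapper` and `delay` are assumptions for this example); see the repository linked below for the real interface.

```python
from collections import deque


class DelayRewardWrapper:
    """Delay each reward by a fixed number of environment steps.

    Hypothetical sketch of the 'delayed rewards' hardness dimension;
    MDP Playground's real wrappers expose a different interface.
    """

    def __init__(self, env, delay):
        self.env = env
        self.delay = delay
        self._buffer = deque()  # holds rewards not yet released to the agent

    def reset(self):
        self._buffer.clear()
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._buffer.append(reward)
        # Emit 0 until `delay` rewards have accumulated, then
        # release buffered rewards in their original order.
        if len(self._buffer) > self.delay:
            delayed = self._buffer.popleft()
        else:
            delayed = 0.0
        return obs, delayed, done, info
```

With `delay=2`, an environment that pays reward 1.0 every step would yield 0.0 on the first two steps and 1.0 thereafter, shifting the credit-assignment problem the agent must solve.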
Supplementary Material: zip
URL: https://github.com/automl/mdp-playground