ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical PerspectivesDownload PDF

12 Oct 2021 (modified: 22 Oct 2023)Deep RL Workshop NeurIPS 2021Readers: Everyone
Keywords: Reinforcement Learning, Deep Reinforcement Learning, Value Iteration, Policy Iteration, Open-Source Library
TL;DR: We present a new open-source library for evaluating RL algorithms from theoretical and practical perspectives.
Abstract: We present ShinRL, an open-source library specialized for the evaluation of reinforcement learning (RL) algorithms from both theoretical and practical perspectives. Existing RL libraries typically allow users to evaluate practical performances of deep RL algorithms through returns. Nevertheless, these libraries are not necessarily useful for analyzing if the algorithms perform as theoretically expected, such as if Q learning really achieves the optimal Q function. In contrast, ShinRL provides an RL environment interface that can compute metrics for delving into the behaviors of RL algorithms, such as the gap between learned and the optimal Q values and state visitation frequencies. In addition, we introduce a solver interface for evaluating both theoretically justified algorithms (e.g., dynamic programming and tabular RL) and practically effective ones (i.e., deep RL, typically with some additional extensions and regularizations) in a consistent fashion. As a case study, we show that how combining these two features of ShinRL makes it easier to analyze the behavior of deep Q learning. Furthermore, we demonstrate that ShinRL can be used to empirically validate some recent theoretical findings such as the effect of KL regularization for value iteration [Kozuno et al., 2019] and for deep Q learning [Vieillard et al., 2020a], and the robustness of entropy-regularized policies to adversarial rewards [Husain et al., 2021]. The ShinRL source code can be found on GitHub: https://github.com/omron-sinicx/ShinRL.
Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/arxiv:2112.04123/code)
0 Replies

Loading