[Re] A Family of Robust Stochastic Operators for Reinforcement Learning

02 Dec 2019 (modified: 05 May 2023) NeurIPS 2019 Reproducibility Challenge Blind Report
Abstract: We replicate the results for the new family of robust stochastic operators (RSOs) proposed in~\cite{paper}. In reinforcement learning, approximation/estimation errors increase the probability of taking sub-optimal actions. One way to reduce the effect of these errors on performance is to increase the action gap, i.e., the value difference between the best and next-best actions. RSO learning increases the action gap and should therefore be more robust to approximation/estimation errors. The original paper tests the operators on problems with substantial approximation errors to demonstrate their effectiveness. We recreate the figure comparing the performance of RSO with that of other operators on Mountain Car, and confirm that RSO performs better than the other operators regardless of $\epsilon$. We additionally show that, even when optimal hyperparameters are selected for the other operators, RSO still performs substantially better. Finally, we attempt to replicate the superiority of the authors' preferred RSO parameterization, but find that an alternative parameterization performs better.
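To make the action-gap mechanism concrete, the following is a minimal tabular sketch of a gap-increasing stochastic backup. The function name `rso_style_backup`, the Uniform(0, 1) draw for the random weight, and the usage snippet are illustrative assumptions on our part; the exact family of operators and the distributions of the stochastic coefficients are defined in the original paper.

```python
import numpy as np

def rso_style_backup(Q, s, a, r, s_next, gamma, rng):
    """One gap-increasing backup in the spirit of the RSO family.

    Illustrative sketch only: we assume the target subtracts the current
    action gap, max_b Q(s, b) - Q(s, a), weighted by a random coefficient
    beta ~ Uniform(0, 1); the paper's operators may use a different
    parameterization.
    """
    beta = rng.uniform(0.0, 1.0)                    # stochastic gap weight (assumed distribution)
    bellman_target = r + gamma * np.max(Q[s_next])  # standard Q-learning target
    action_gap = np.max(Q[s]) - Q[s, a]             # gap between the best and the chosen action
    return bellman_target - beta * action_gap       # penalizing non-greedy actions widens the gap


# Hypothetical usage inside a tabular Q-learning loop:
# rng = np.random.default_rng(0)
# Q = np.zeros((n_states, n_actions))
# target = rso_style_backup(Q, s, a, r, s_next, gamma=0.99, rng=rng)
# Q[s, a] += alpha * (target - Q[s, a])
```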
Track: Replicability
NeurIPS Paper Id: https://openreview.net/forum?id=B1M-ASSgUS