[Re] Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

02 Dec 2019 (modified: 05 May 2023) · NeurIPS 2019 Reproducibility Challenge Blind Report
Abstract: We reproduce the main results of the paper Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, covering the performance of the baseline algorithms as well as BEAR-QL. We analyze our results against those reported in the paper, empirically show that BEAR-QL can learn from randomly collected data and reaches optimal or near-optimal performance when trained on optimal- or medium-quality datasets across several continuous control tasks, and offer practical suggestions for reproducing the paper's results.
Track: Replicability
NeurIPS Paper Id: https://openreview.net/forum?id=H1xutHSxLS&noteId=S1gUDPe0tr
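
The BEAR-QL algorithm named in the abstract keeps the learned policy within the support of the dataset's behavior policy via a sampled maximum mean discrepancy (MMD) penalty between actions drawn from the two policies. Below is a minimal NumPy sketch of that sampled MMD; the Gaussian kernel and the bandwidth value are assumptions chosen for illustration (the original work tunes the kernel and bandwidth per task).

```python
import numpy as np

def gaussian_kernel(x, y, sigma=20.0):
    # Pairwise Gaussian kernel values between two sets of action samples,
    # shapes (n, d) and (m, d) -> (n, m).
    diff = x[:, None, :] - y[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))

def mmd_squared(policy_actions, behavior_actions, sigma=20.0):
    # Sampled squared MMD between actions from the learned policy and
    # actions from the (unknown) behavior policy that produced the dataset.
    # BEAR-QL penalizes policy updates when this quantity grows too large.
    k_pp = gaussian_kernel(policy_actions, policy_actions, sigma).mean()
    k_pb = gaussian_kernel(policy_actions, behavior_actions, sigma).mean()
    k_bb = gaussian_kernel(behavior_actions, behavior_actions, sigma).mean()
    return k_pp - 2.0 * k_pb + k_bb

# Example: the penalty is near zero when both sample sets come from the
# same distribution, and grows as the learned policy drifts off-support.
rng = np.random.default_rng(0)
pi_actions = rng.normal(size=(16, 6))    # sampled from the learned policy
beta_actions = rng.normal(size=(16, 6))  # actions stored in the offline dataset
print(mmd_squared(pi_actions, beta_actions))
```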