[Re] Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

Dec 02, 2019 (edited Oct 13, 2020) · NeurIPS 2019 Reproducibility Challenge Blind Report
  • Abstract: In this report, we reproduce the main results of the paper Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, including the performance of the baseline algorithms as well as BEAR-QL. We analyze and compare our results with those reported in the original paper, empirically show that BEAR-QL can learn from a random dataset and achieves optimal or suboptimal performance when trained on optimal- or medium-quality datasets across different continuous control tasks, and provide practical suggestions for reproducing the paper's results.
  • Track: Replicability
  • NeurIPS Paper Id: https://openreview.net/forum?id=H1xutHSxLS&noteId=S1gUDPe0tr