From Importance Sampling to Doubly Robust Policy Gradient

2020 (modified: 19 Apr 2023), ICML 2020
Abstract: We show that on-policy policy gradient (PG) and its variance reduction variants can be derived by taking finite-difference of function evaluations supplied by estimators from the importance samplin...