From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang, Nan Jiang
2020 (modified: 19 Apr 2023)
ICML 2020
Readers: Everyone
Abstract:
We show that on-policy policy gradient (PG) and its variance reduction variants can be derived by taking finite-difference of function evaluations supplied by estimators from the importance samplin...
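As a rough illustration of the finite-difference idea in the abstract, the sketch below uses a toy one-step bandit with a softmax policy. This setup is an assumption made here for illustration and is not taken from the paper. It evaluates an importance sampling estimate at a slightly perturbed target policy, takes a forward difference, and compares the result with the usual score-function (REINFORCE-style) gradient on the same samples. All function names, the reward values, and the step size are hypothetical, and the sketch covers only the basic on-policy gradient, not the variance reduction variants.

```python
import math
import random


def softmax(theta):
    """Softmax action probabilities for a parameter vector."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def is_estimate(theta_target, theta_behavior, samples):
    """Importance sampling estimate of the target policy's expected reward,
    using (action, reward) samples drawn from the behavior policy."""
    pi_t = softmax(theta_target)
    pi_b = softmax(theta_behavior)
    return sum(pi_t[a] / pi_b[a] * r for a, r in samples) / len(samples)


def finite_difference_gradient(theta, samples, eps=1e-6):
    """Forward difference of the IS estimate around the behavior policy."""
    base = is_estimate(theta, theta, samples)
    grad = []
    for i in range(len(theta)):
        shifted = list(theta)
        shifted[i] += eps
        grad.append((is_estimate(shifted, theta, samples) - base) / eps)
    return grad


def score_function_gradient(theta, samples):
    """Score-function gradient: the sample mean of grad log pi(a) * r."""
    pi = softmax(theta)
    grad = [0.0] * len(theta)
    for a, r in samples:
        for i in range(len(theta)):
            indicator = 1.0 if i == a else 0.0
            grad[i] += (indicator - pi[i]) * r
    return [g / len(samples) for g in grad]


# Toy problem (assumed for illustration): 3 actions with noisy rewards.
rng = random.Random(0)
theta = [0.2, -0.1, 0.5]
mean_reward = [1.0, 0.0, 2.0]
pi = softmax(theta)

samples = []
for _ in range(1000):
    a = rng.choices(range(3), weights=pi)[0]
    samples.append((a, mean_reward[a] + rng.gauss(0.0, 0.1)))

fd = finite_difference_gradient(theta, samples)
sf = score_function_gradient(theta, samples)
print("finite difference:", fd)
print("score function:   ", sf)
```

On the same batch of samples the two gradient vectors should agree up to the finite-difference error, since the derivative of the importance weight at the behavior policy equals the gradient of the log-probability of the sampled action.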