Average-Reward Off-Policy Policy Evaluation with Function Approximation

2021 (modified: 24 Feb 2022) | ICML 2021 | Readers: Everyone
Abstract: We consider off-policy policy evaluation with function approximation (FA) in average-reward MDPs, where the goal is to estimate both the reward rate and the differential value function. For this pr...