Data-Efficient Policy Evaluation Through Behavior Policy Search

2017 (modified: 11 Nov 2022) | ICML 2017 | Readers: Everyone
Abstract: We consider the task of evaluating a policy for a Markov decision process (MDP). The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance. We show...