Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning

Philip S. Thomas, Emma Brunskill

2016 (modified: 11 Nov 2022), ICML 2016, Readers: Everyone

Abstract: In this paper we present a new way of predicting the performance of a reinforcement learning policy given historical data that may have been generated by a different policy. The ability to evaluate...
