Off-policy Model-based Learning under Unknown Factored Dynamics

Assaf Hallak, François Schnitzler, Timothy Arthur Mann, Shie Mannor

2015 (modified: 11 Nov 2022); ICML 2015; Readers: Everyone

Abstract: Off-policy learning in dynamic decision problems is essential for providing strong evidence that a new policy is better than the one in use. But how can we prove superiority without testing the new...
