Actor-Critic based Improper Reinforcement Learning

Mohammadi Zaki, Avi Mohan, Aditya Gopalan, Shie Mannor

2022 (modified: 24 Apr 2023), ICML 2022, Readers: Everyone

Abstract: We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potent...
