Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP

Kefan Dong, Yuanhao Wang, Xiaoyu Chen, Liwei Wang

23 Sept 2020 (modified: 05 May 2023), ICLR 2020, Readers: Everyone