Large-Scale Bandit Problems and KWIK Learning

2013 (modified: 08 Nov 2022), ICML (1) 2013
Abstract: We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK lea...