Efficient Online Learning under Bandit Feedback

2018 (modified: 04 Nov 2022)
Readers: Everyone
Abstract: In this thesis, we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. In particular, we investigate the case when the expected rewards are a Lipschitz function ...