Efficient Online Learning under Bandit Feedback

Stefan Magureanu

2018 (modified: 04 Nov 2022), Readers: Everyone

Abstract: In this thesis, we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. In particular, we investigate the case when the expected rewards are a Lipschitz function ...
