Rotting bandits are no harder than stochastic ones.

AISTATS 2019
Abstract: In stochastic multi-armed bandits, the reward distribution of each arm is assumed to be stationary. This assumption is often violated in practice (e.g., in recommendation systems), where the reward...
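To make the contrast between the stationary assumption and a "rotting" (non-stationary) arm concrete, here is a minimal sketch. It assumes a simple linear decay of the expected reward with the number of past pulls; the function names, decay shape, and parameters are illustrative choices and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def pull_stationary(mean, noise=0.1):
    """Stationary arm: the expected reward never changes across pulls."""
    return mean + rng.normal(0.0, noise)

def pull_rotting(base_mean, n_pulls, decay=0.05, noise=0.1):
    """Rotting arm: the expected reward decays with the number of past pulls
    (a linear decay is one illustrative choice, not the paper's model)."""
    return max(base_mean - decay * n_pulls, 0.0) + rng.normal(0.0, noise)

# Pull each arm 20 times and compare average rewards: the stationary arm keeps
# its mean, while the rotting arm's reward shrinks the more it is pulled.
stationary_rewards = [pull_stationary(0.8) for _ in range(20)]
rotting_rewards = [pull_rotting(0.8, t) for t in range(20)]
print(f"stationary mean reward: {np.mean(stationary_rewards):.3f}")
print(f"rotting mean reward:    {np.mean(rotting_rewards):.3f}")
```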