AISTATS 2019
Abstract: In stochastic multi-armed bandits, the reward distribution of each arm is assumed to be stationary. This assumption is often violated in practice (e.g., in recommendation systems), where the reward...
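Not from the paper: a minimal Python/NumPy sketch contrasting a stationary Bernoulli arm with a drifting one, only to illustrate the stationarity assumption the abstract describes. The arm constructions and the linear drift schedule are invented for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def stationary_arm(p, horizon):
    """Bernoulli arm with a fixed mean p: the usual stationarity assumption."""
    return rng.binomial(1, p, size=horizon)

def drifting_arm(p_start, p_end, horizon):
    """Bernoulli arm whose mean drifts linearly over time, violating
    stationarity (e.g., a recommended item whose appeal fades)."""
    means = np.linspace(p_start, p_end, horizon)
    return rng.binomial(1, means)

T = 10_000
fixed = stationary_arm(0.6, T)
drift = drifting_arm(0.6, 0.2, T)

# Empirical means over the first and second halves of the horizon:
# the stationary arm's halves agree, the drifting arm's do not.
for name, rewards in [("stationary", fixed), ("drifting", drift)]:
    print(name, rewards[:T // 2].mean(), rewards[T // 2:].mean())
```

Under drift, any algorithm that trusts all past observations equally will misestimate the current arm means, which is why non-stationary bandit methods typically discount or window old rewards.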