2021 (modified: 19 May 2022) | ICML 2021 | Readers: Everyone
Abstract: We study the stochastic Multi-Armed Bandit (MAB) problem with random delays in the feedback received by the algorithm. We consider two settings: the reward-dependent delay setting, where real...