Follow-the-Perturbed-Leader Achieves Best-of-Both-Worlds for Bandit Problems

Junya Honda, Shinji Ito, Taira Tsuchiya

2023 (modified: 24 Apr 2023), ALT 2023, Readers: Everyone

Abstract: This paper discusses the adversarial and stochastic $K$-armed bandit problems. In the adversarial setting, the best possible regret is known to be $O(\sqrt{KT})$ for time horizon $T$. This bound ca...
