Further Optimal Regret Bounds for Thompson Sampling

Shipra Agrawal, Navin Goyal

2013 (modified: 08 Nov 2022) | AISTATS 2013 | Readers: Everyone

Abstract: Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after severa...
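For readers unfamiliar with the heuristic, the following is a minimal sketch of Thompson Sampling. It is not taken from the paper: it assumes Bernoulli (0/1) rewards and a uniform Beta(1, 1) prior on each arm, and the function name `thompson_sampling` and its parameters `true_means`, `horizon`, and `seed` are hypothetical names chosen for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling; return pull counts per arm.

    Assumption: each arm i pays reward 1 with probability true_means[i],
    else 0, and every arm starts with a uniform Beta(1, 1) prior.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(horizon):
        # Draw one sample from each arm's Beta posterior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulate a Bernoulli reward from the (hidden) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Update that arm's posterior counts.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.1, 0.9], horizon=2000))
```

The randomization comes entirely from sampling the posteriors: arms with little data have wide posteriors and are still occasionally played, while an arm that is clearly better ends up receiving most of the pulls.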
