Further Optimal Regret Bounds for Thompson Sampling

AISTATS 2013 (modified: 08 Nov 2022)
Abstract: Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after severa...
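
For readers unfamiliar with the heuristic, the following is a minimal sketch of Thompson Sampling on a Bernoulli bandit. It is an illustration only, not code from the paper: the Beta(1, 1) prior, the function name `thompson_sampling_bernoulli`, and the parameters `true_means`, `horizon`, and `seed` are all assumptions made here. In each round the sketch draws one sample per arm from that arm's Beta posterior, plays the arm with the largest sample, and updates the posterior counts with the observed 0/1 reward.

```python
import random


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson Sampling (illustrative sketch).

    true_means: hypothetical success probability of each arm (unknown to the learner).
    horizon:    number of rounds to play.
    Returns (pulls per arm, total reward).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards of 1 per arm
    failures = [0] * k   # observed rewards of 0 per arm
    total_reward = 0

    for _ in range(horizon):
        # Assumed uniform Beta(1, 1) prior, so the posterior is Beta(S + 1, F + 1).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=lambda i: samples[i])

        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total_reward += reward

    pulls = [successes[i] + failures[i] for i in range(k)]
    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling_bernoulli([0.1, 0.9], horizon=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

The randomization comes entirely from the posterior samples: arms with few observations have wide posteriors and are still tried occasionally, while arms with many observations are played roughly in proportion to how likely they are to be the best.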