Linear Thompson Sampling Revisited.

Marc Abeille, Alessandro Lazaric

AISTATS 2017 (modified: 09 Nov 2022)

Abstract: We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic linear bandit setting. While we obtain a regret bound of order $O(d^{3/2}\sqrt{T})$ as in previous results, the...
