Thompson Sampling Achieves $\tilde{O}(\sqrt{T})$ Regret in Linear Quadratic Control

Taylan Kargin, Sahin Lale, Kamyar Azizzadenesheli, Animashree Anandkumar, Babak Hassibi

2022 (modified: 15 Nov 2022) | COLT 2022 | Readers: Everyone

Abstract: Thompson Sampling (TS) is an efficient method for decision-making under uncertainty, where an action is sampled from a carefully prescribed distribution which is updated based on the observed data....
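The sample-then-update loop described in the abstract can be illustrated with a minimal sketch. It assumes a Beta-Bernoulli multi-armed bandit, which is a much simpler setting than the linear quadratic control problem in the title and is not the paper's algorithm. The function name `thompson_sampling_bernoulli`, the arm means, and the uniform Beta(1, 1) prior are all hypothetical choices made for illustration.

```python
import random


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Illustrative Thompson Sampling on a Bernoulli bandit (assumed setting).

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Beta(1, 1) prior on each arm's success probability (an assumption).
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms

    for _ in range(horizon):
        # Sample one plausible mean per arm from the current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a Bernoulli reward from the (unknown to the learner) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Update the posterior of the pulled arm with the observed data.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling_bernoulli([0.2, 0.5, 0.8], horizon=2000))
```

In this toy setting the posterior concentrates on the arm with the highest mean, so most pulls should go to that arm as the horizon grows.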
