Thompson Sampling Achieves $\tilde{O}(\sqrt{T})$ Regret in Linear Quadratic Control

2022 (modified: 15 Nov 2022), COLT 2022. Readers: Everyone
Abstract: Thompson Sampling (TS) is an efficient method for decision-making under uncertainty, in which an action is sampled from a carefully prescribed distribution that is updated based on the observed data....