UAI 2022 (modified: 22 Dec 2022)
Abstract: We propose a new bootstrap-based online algorithm for stochastic linear bandit problems. The key idea is to adopt residual bootstrap exploration, in which the agent estimates the next step reward b...
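The abstract is truncated, so the details of the proposed algorithm are not available here. As a rough illustration only, the following is a minimal sketch of the generic residual-bootstrap exploration idea in a linear bandit: fit a ridge estimate, resample its residuals with replacement to form perturbed rewards, refit on the perturbed data, and act greedily on the perturbed estimate. The function name, warm-start rule, and all parameters are hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_bootstrap_linear_bandit(arms, T, theta_true, noise_sd=0.1, lam=1.0):
    """Hypothetical sketch of residual-bootstrap exploration in a
    stochastic linear bandit (not the paper's exact algorithm)."""
    d = arms.shape[1]
    X, r = [], []
    total_reward = 0.0
    for t in range(T):
        if len(r) < d:
            # warm start: pull arms round-robin until we have d samples
            a = t % len(arms)
        else:
            Xm, rv = np.array(X), np.array(r)
            A = Xm.T @ Xm + lam * np.eye(d)          # ridge Gram matrix
            theta_hat = np.linalg.solve(A, Xm.T @ rv) # ridge estimate
            res = rv - Xm @ theta_hat                 # regression residuals
            # residual bootstrap: resample residuals with replacement
            boot = rng.choice(res, size=len(res), replace=True)
            r_tilde = Xm @ theta_hat + boot           # perturbed rewards
            theta_tilde = np.linalg.solve(A, Xm.T @ r_tilde)
            # act greedily on the perturbed estimate (exploration comes
            # from the randomness of the bootstrap resample)
            a = int(np.argmax(arms @ theta_tilde))
        x = arms[a]
        reward = x @ theta_true + noise_sd * rng.standard_normal()
        X.append(x)
        r.append(reward)
        total_reward += reward
    return total_reward
```

On a toy two-arm problem where one arm has mean reward 1 and the other 0, this sketch quickly concentrates its pulls on the better arm; the bootstrap resample of residuals plays the role that posterior sampling plays in Thompson sampling.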