Efficient Prior Selection in Gaussian Process Bandits with Thompson Sampling

Published: 17 Jul 2025, Last Modified: 06 Sept 2025EWRL 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Multi-armed bandits, Gaussian processes, Thompson sampling, Hyperparameter optimization, Bayesian optimization
TL;DR: We propose, analyze and evaluate two algorithms for joint prior selection and regret minimization based on Gaussian process Thompson sampling.
Abstract: Gaussian process (GP) bandits provide a powerful framework for solving blackbox optimization of unknown functions. The characteristics of the unknown function depend heavily on the assumed GP prior. Most work in the literature assume that this prior is known but in practice this seldom holds. Instead, practitioners often rely on maximum likelihood estimation to select the hyperparameters of the prior - which lacks theoretical guarantees. In this work, we propose two algorithms for joint prior selection and regret minimization in GP bandits based on GP Thompson sampling (GP-TS): Prior-Elimination GP-TS (PE-GP-TS) and HyperPrior GP-TS (HP-GP-TS). We theoretically analyze the algorithms and establish upper bounds for the regret of HP-GP-TS. In addition, we demonstrate the effectiveness of our algorithms compared to the alternatives through experiments with synthetic and real-world data.
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Jack_Sandberg1
Track: Regular Track: unpublished work
Submission Number: 87
Loading