Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 oralEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Gaussian process bandits, regret analysis, Bayesian optimization
Abstract: This paper addresses the Bayesian optimization problem (also referred to as the Bayesian setting of the Gaussian process bandit), where the learner seeks to minimize the regret under a function drawn from a known Gaussian process (GP). Under a Mat\'ern kernel with some extent of smoothness, we show that the Gaussian process upper confidence bound (GP-UCB) algorithm achieves $\tilde{O}(\sqrt{T})$ cumulative regret with high probability. Furthermore, our analysis yields $O(\sqrt{T \ln^2 T})$ regret under a squared exponential kernel. These results fill the gap between the existing regret upper bound of GP-UCB and the current best upper bound provided by Scarlett [2018]. The key idea in our proof is to capture the concentration behavior of the input sequence realized by GP-UCB, enabling us to handle GP's information gain in a refined manner.
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 16340
Loading