Optimal Order Simple Regret for Gaussian Process Bandits

21 May 2021, 20:43 (edited 25 Oct 2021) | NeurIPS 2021 Poster | Readers: Everyone
  • Keywords: Gaussian Process Bandit, Confidence Intervals, RKHS, Optimal Order Simple Regret
  • TL;DR: We prove order-optimal simple regret for the Gaussian process bandit problem when the objective function is in a reproducing kernel Hilbert space (RKHS).
  • Abstract: Consider the sequential optimization of a continuous, possibly non-convex, and expensive-to-evaluate objective function $f$. The problem can be cast as a Gaussian Process (GP) bandit where $f$ lives in a reproducing kernel Hilbert space (RKHS). State-of-the-art analyses of several learning algorithms show a significant gap between the lower and upper bounds on the simple regret performance. With $N$ denoting the number of exploration trials and $\gamma_N$ the maximal information gain, we prove an $\tilde{\mathcal{O}}(\sqrt{\gamma_N/N})$ bound on the simple regret performance of a pure exploration algorithm, which is significantly tighter than the existing bounds. We show that this bound is order-optimal up to logarithmic factors in the cases where a lower bound on regret is known. To establish these results, we prove novel and sharp confidence intervals for GP models applicable to RKHS elements, which may be of broader interest.
  • Supplementary Material: pdf
  • Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
  • Code: zip
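
To make the setting in the abstract concrete, the following is a minimal sketch of one common pure-exploration rule for GP bandits: query the point with the largest posterior variance for $N$ trials, then report the maximizer of the posterior mean. The abstract does not specify the algorithm, so the sampling rule, the squared-exponential kernel, the 1-D grid domain, and all names (`rbf_kernel`, `gp_posterior`, `pure_exploration`) and default parameters are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise_var=1e-2):
    """Zero-mean GP posterior mean and variance on x_grid given noisy observations."""
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_star = rbf_kernel(x_obs, x_grid)            # shape (n_obs, n_grid)
    solved = np.linalg.solve(K, k_star)           # K^{-1} k_star
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_star * solved, axis=0)   # k(x, x) = 1 for this kernel
    return mean, np.maximum(var, 0.0)             # clip tiny negative round-off

def pure_exploration(f, x_grid, n_trials, noise_std=0.1, seed=0):
    """Sample where the posterior variance is largest; return the argmax of the mean."""
    rng = np.random.default_rng(seed)
    x_obs = np.empty(0)
    y_obs = np.empty(0)
    for _ in range(n_trials):
        if len(x_obs) == 0:
            idx = 0  # prior variance is uniform, so any starting point is valid
        else:
            _, var = gp_posterior(x_obs, y_obs, x_grid, noise_std ** 2)
            idx = int(np.argmax(var))
        x = x_grid[idx]
        y = f(x) + noise_std * rng.normal()       # noisy evaluation of f
        x_obs = np.append(x_obs, x)
        y_obs = np.append(y_obs, y)
    mean, _ = gp_posterior(x_obs, y_obs, x_grid, noise_std ** 2)
    return x_grid[int(np.argmax(mean))]

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 101)
    objective = lambda x: -(x - 0.3) ** 2         # toy objective, maximized at 0.3
    x_hat = pure_exploration(objective, grid, n_trials=20)
    simple_regret = objective(0.3) - objective(x_hat)
    print(f"recommended x = {x_hat:.2f}, simple regret = {simple_regret:.4f}")
```

Here the simple regret is the gap between the optimal value of $f$ and its value at the point recommended after the $N$ exploration trials; this is the quantity that the abstract bounds by $\tilde{\mathcal{O}}(\sqrt{\gamma_N/N})$. The sample queries never depend on the observed values, only on the posterior variance, which is what makes the rule pure exploration.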