Bayesian Analysis of Combinatorial Gaussian Process Bandits

Jack Sandberg; Niklas Åkerblom; Morteza Haghir Chehreghani

Published: 22 Jan 2025, Last Modified: 16 May 2025

Venue: ICLR 2025 Poster

License: CC BY 4.0

Keywords: Multi-armed bandits, Combinatorial bandits, Contextual bandits, Gaussian processes, Energy-efficient navigation

TL;DR: We present novel Bayesian regret bounds for GP-UCB, GP-BayesUCB and GP-TS for the combinatorial volatile Gaussian process semi-bandit problem and study their application to online energy-efficient navigation.

Abstract: We consider the combinatorial volatile Gaussian process (GP) semi-bandit problem. Each round, an agent is provided a set of available base arms and must select a subset of them to maximize the long-term cumulative reward. We study the Bayesian setting and provide novel Bayesian cumulative regret bounds for three GP-based algorithms: GP-UCB, GP-BayesUCB and GP-TS. Our bounds extend previous results for GP-UCB and GP-TS to the infinite, volatile and combinatorial setting, and to the best of our knowledge, we provide the first regret bound for GP-BayesUCB. Volatile arms encompass other widely considered bandit problems such as contextual bandits. Furthermore, we employ our framework to address the challenging real-world problem of online energy-efficient navigation, where we demonstrate its effectiveness compared to the alternatives.
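To make the setting in the abstract concrete, the following is a minimal sketch of a GP-UCB-style loop for a combinatorial volatile semi-bandit. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the squared-exponential kernel, the noise variance, the constant `beta`, the hypothetical reward function `f`, the one-dimensional arm contexts, and the top-`k` oracle (which presumes the reward of a selected subset is the sum of its base arm rewards).

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel between rows of a and rows of b (assumed kernel).
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=0.1):
    # Zero-mean GP posterior mean and standard deviation at x_new.
    if len(x_obs) == 0:
        return np.zeros(len(x_new)), np.ones(len(x_new))
    k_oo = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_no = rbf_kernel(x_new, x_obs)
    mean = k_no @ np.linalg.solve(k_oo, y_obs)
    var = 1.0 - np.einsum("ij,ji->i", k_no, np.linalg.solve(k_oo, k_no.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_gp_ucb(rounds=30, n_available=8, k=3, beta=4.0, seed=0):
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(3.0 * x[:, 0])  # hypothetical expected base arm reward
    x_obs, y_obs = np.empty((0, 1)), np.empty(0)
    for _ in range(rounds):
        # Volatile arms: a fresh set of available base arms (contexts) each round.
        x_avail = rng.uniform(0.0, 2.0, size=(n_available, 1))
        mean, std = gp_posterior(x_obs, y_obs, x_avail)
        ucb = mean + np.sqrt(beta) * std
        # Assumed oracle: pick the k base arms with the largest UCB index.
        chosen = np.argsort(ucb)[-k:]
        # Semi-bandit feedback: one noisy reward per selected base arm.
        rewards = f(x_avail[chosen]) + rng.normal(0.0, np.sqrt(0.1), size=k)
        x_obs = np.vstack([x_obs, x_avail[chosen]])
        y_obs = np.concatenate([y_obs, rewards])
    return x_obs, y_obs
```

Calling `run_gp_ucb()` returns the contexts and rewards of every base arm played. Replacing the UCB index with a posterior quantile or a posterior sample would give rough analogues of GP-BayesUCB and GP-TS, again only as a sketch of the general idea.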

Primary Area: reinforcement learning

Submission Number: 354
