A Finite Time Analysis of Thompson Sampling for Bayesian Optimization with Preferential Feedback

Published: 03 Feb 2026, Last Modified: 06 Feb 2026, Venue: AISTATS 2026 Poster, Readers: Everyone, License: CC BY 4.0
Abstract: Preference feedback, in the form of pairwise comparisons rather than scalar scores, has seen growing use in applications such as human-, lab-, and expert-in-the-loop design and scientific discovery. We propose a Thompson Sampling (TS) approach to Bayesian optimization with preferential feedback that models comparisons through a monotone link on latent utility differences and leverages the dueling kernel induced by a base kernel. We give a finite-time analysis showing that its performance matches that of standard TS for conventional Bayesian optimization with scalar feedback. The analysis exploits TS's anchor-invariance for challenger selection and introduces a double-TS pairing variant. We also demonstrate the method's performance on both synthetic and real examples.
Submission Number: 445
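
The abstract refers to a dueling kernel induced by a base kernel and to a monotone link on latent utility differences, but gives no formulas. The sketch below illustrates the standard construction under stated assumptions: if the latent utility f is a zero-mean Gaussian process with base kernel k, then the difference f(x) - f(x') has covariance k(x, y) - k(x, y') - k(x', y) + k(x', y') with f(y) - f(y'). The squared-exponential base kernel, the logistic link, and the names `rbf_kernel`, `dueling_kernel`, and `logistic_link` are illustrative choices, not taken from the paper.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Base kernel k(a, b). Squared-exponential is an assumed choice."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    # Pairwise squared Euclidean distances, shape (len(a), len(b)).
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def dueling_kernel(X, Xp, Y, Yp, base_kernel=rbf_kernel):
    """Covariance between f(x) - f(x') and f(y) - f(y') for f ~ GP(0, k).

    X, Xp hold the first and second items of one set of pairs;
    Y, Yp hold the first and second items of another set of pairs.
    """
    return (
        base_kernel(X, Y)
        - base_kernel(X, Yp)
        - base_kernel(Xp, Y)
        + base_kernel(Xp, Yp)
    )


def logistic_link(z):
    """One example of a monotone link: P(x preferred to x') given z = f(x) - f(x')."""
    return 1.0 / (1.0 + np.exp(-z))


# Usage: Gram matrix over five random pairs in two dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
Xp = rng.normal(size=(5, 2))
K = dueling_kernel(X, Xp, X, Xp)
```

Because the dueling kernel is the covariance of a linear functional of f, the resulting Gram matrix is symmetric and positive semidefinite, and swapping the two items of a pair flips the sign of the corresponding row. Any other positive semidefinite base kernel or monotone link could be substituted.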