A Hierarchical Nearest Neighbour Approach to Contextual Bandits

TMLR Paper5312 Authors

06 Jul 2025 (modified: 17 Jul 2025)Under review for TMLREveryoneRevisionsBibTeXCC BY 4.0
Abstract: In this paper we consider the contextual bandit problem in metric spaces. We design and analyse an algorithm that can handle the fully adversarial problem in which no assumptions are made about the space itself, or the generation of contexts and losses. In addition to analysing our performance on general metric spaces, we further analyse the important special case in which the space is euclidean, and furthermore analyse the i.i.d. stochastic setting. Unlike previous work our algorithm is adaptive to the local density of contexts and the smoothness of the decision boundary of the comparator policy, as well as other quantities. Our algorithm is highly efficient - having a per-trial time polylogarithmic in both the number of trials and the number of actions when the dimensionality of the metric space is bounded. We also give the results of real world experiments, demonstrating the excellent performance of our algorithm.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Zheng_Wen1
Submission Number: 5312
Loading