Orthogonal Function Representations for Continuous Armed Bandits

Davide Maran; Marcello Restelli

19 Sept 2023 (modified: 11 Feb 2024) | Submitted to ICLR 2024

Primary Area: reinforcement learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Continuous armed bandit, Orthogonal functions, Linear bandits, Smoothness

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We use orthogonal functions to design two efficient algorithms for the continuous-armed bandit problem.

Abstract: This paper addresses the continuous-armed bandit problem, a generalization of the standard bandit problem in which the action space is a $d$-dimensional hypercube $\mathcal X = [-1, 1]^d$ and the reward is an $s$-times differentiable function $f : \mathcal X \to \mathbb R$. Traditionally, this problem is solved by assuming an implicit feature representation in a Reproducing Kernel Hilbert Space (RKHS), where the objective function is linear in this transformation of $\mathcal X$. Besides requiring this additional assumption, the approach comes at the cost of overwhelming computational complexity. In contrast, we propose an explicit representation using an orthogonal feature map (Fourier, Legendre) to reduce the problem to a linear bandit with misspecification. As a result, we develop two algorithms, _OB-LinUCB_ and _OB-PE_, achieving state-of-the-art performance in terms of regret and computational complexity.
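The reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' OB-LinUCB or OB-PE; it only shows, under stated assumptions, how an explicit Legendre feature map turns a smooth reward into an approximately linear one, with the leftover approximation error playing the role of the misspecification. The sketch assumes $d = 1$, and the names `legendre_features`, `fit_linear_reward`, and the test reward `f` are hypothetical choices made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_features(x, degree):
    """Map actions x in [-1, 1] to the Legendre polynomials P_0..P_degree.

    Returns an array of shape (len(x), degree + 1).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return legendre.legvander(x, degree)


def fit_linear_reward(f, degree, n_grid=200):
    """Least-squares fit of f as a linear function of the Legendre features.

    Returns the coefficient vector theta and the largest absolute residual
    on the grid, used here as a rough proxy for the misspecification level.
    """
    grid = np.linspace(-1.0, 1.0, n_grid)
    phi = legendre_features(grid, degree)
    theta, *_ = np.linalg.lstsq(phi, f(grid), rcond=None)
    misspecification = np.max(np.abs(phi @ theta - f(grid)))
    return theta, misspecification


def f(x):
    # Hypothetical smooth reward function, chosen only for illustration.
    return np.exp(-x**2) * np.sin(3.0 * x)


theta_small, eps_small = fit_linear_reward(f, degree=3)
theta_large, eps_large = fit_linear_reward(f, degree=12)
print(f"degree 3:  max residual = {eps_small:.2e}")
print(f"degree 12: max residual = {eps_large:.2e}")
```

For this smooth test function, the residual shrinks as the degree grows, which is the trade-off a linear bandit algorithm with misspecification would have to balance against the growing feature dimension.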

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

Supplementary Material: pdf

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 1798
