Stochastic Matching Bandits under Preference Feedback

Jung-hun Kim; Min-hwan Oh

22 Sept 2024 (modified: 21 Jan 2025), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0

Keywords: Matching bandits, Preference Feedback

TL;DR: In this study, we propose a new bandit framework of stochastic matching employing the Multinomial Logit (MNL) choice model with feature information.

Abstract: In this study, we propose a new bandit framework for stochastic matching that employs the Multinomial Logit (MNL) choice model with feature information. In this framework, agents on one side are assigned to arms on the other side, and each arm stochastically accepts one agent from its assigned pool according to its unknown preference, with a possible outside option of accepting none. The objective is to minimize regret by maximizing the probability of successful matching. For this framework, we first propose an elimination-based algorithm that achieves a regret bound of $\tilde{O}\big(K\sqrt{rKT}\big)$ over time horizon $T$, where $K$ is the number of arms and $r$ is the rank of the feature space. Furthermore, we propose an approach to resolve the computational issue arising from combinatorial optimization in the algorithm. Lastly, we evaluate our algorithm through experiments, comparing it with existing methods and showing that it performs better.
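
The acceptance step described in the abstract can be illustrated with a minimal sketch of an MNL choice with an outside option. This is not the authors' code. It assumes, purely for illustration, that an arm's utility for an agent is linear in the agent's feature vector and that the outside option has utility 0. The names `mnl_accept_probs`, `expected_matches`, `features`, and `theta` are hypothetical.

```python
import numpy as np


def mnl_accept_probs(features, theta):
    """MNL acceptance probabilities for one arm (illustrative assumption:
    utility = features @ theta, outside option has utility 0).

    features: array of shape (n_assigned, d), one row per assigned agent.
    theta:    array of shape (d,), the arm's preference parameter.
    Returns (probabilities over assigned agents, probability of accepting none).
    """
    features = np.asarray(features, dtype=float)
    if features.shape[0] == 0:
        # No agents assigned: the arm can only take the outside option.
        return np.zeros(0), 1.0
    utilities = features @ np.asarray(theta, dtype=float)
    # Shift by the largest utility (including the outside option's 0)
    # so that the exponentials cannot overflow.
    shift = max(float(utilities.max()), 0.0)
    weights = np.exp(utilities - shift)
    outside = np.exp(-shift)
    denom = outside + weights.sum()
    return weights / denom, outside / denom


def expected_matches(assignment, features, thetas):
    """Expected number of successful matches for a given assignment.

    assignment: dict mapping arm index -> list of assigned agent indices.
    features:   array of shape (n_agents, d).
    thetas:     array of shape (n_arms, d).
    """
    features = np.asarray(features, dtype=float)
    total = 0.0
    for arm, agents in assignment.items():
        idx = np.asarray(agents, dtype=int)
        probs, _ = mnl_accept_probs(features[idx], thetas[arm])
        total += probs.sum()  # probability that this arm accepts some agent
    return total
```

Under these assumptions, assigning more agents to an arm raises that arm's chance of accepting someone but with diminishing returns, which is why choosing the assignment is a combinatorial problem.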

Supplementary Material: zip

Primary Area: learning theory

Submission Number: 2717