Keywords: Two-sided matching markets, Stable matching, Pure exploration
TL;DR: We study algorithms for finding Probably Correct Optimal Stable Matchings in a setting where agents' preferences are learned through repeated interaction using bandit feedback.
Abstract: We consider a learning problem for the stable marriage model under unknown preferences on the left side of the market. We focus on the centralized case, where at each time step an online platform matches the agents and obtains a noisy evaluation reflecting their preferences. Our aim is to quickly identify the stable matching that is left-side optimal, rendering this a pure exploration problem with bandit feedback. We specifically aim to find Probably Correct Optimal Stable Matchings and present several bandit algorithms to do so. Our findings provide a foundational understanding of how to efficiently gather and utilize preference information to identify the optimal stable matching in two-sided markets under uncertainty. An experimental analysis on synthetic data complements the theoretical results on the sample complexities of the proposed methods.
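To make the setting concrete, the sketch below illustrates the two ingredients the abstract combines: estimating left-side preferences from noisy bandit feedback, and computing the left-optimal stable matching via Gale–Shapley deferred acceptance. This is an illustrative toy (uniform sampling, Gaussian noise, and the function names are placeholders), not the paper's algorithm, which samples adaptively.

```python
import random

def gale_shapley(left_prefs, right_prefs):
    """Deferred acceptance with the left side proposing; returns the
    left-optimal stable matching as a dict {left_agent: right_agent}.
    left_prefs[i] / right_prefs[j] list partners in decreasing preference."""
    n = len(left_prefs)
    # rank[j][i] = position of left agent i in right agent j's list
    rank = [{i: r for r, i in enumerate(p)} for p in right_prefs]
    next_prop = [0] * n           # next index each left agent proposes to
    match_of_right = [None] * n   # current partner of each right agent
    free = list(range(n))
    while free:
        i = free.pop()
        j = left_prefs[i][next_prop[i]]
        next_prop[i] += 1
        cur = match_of_right[j]
        if cur is None:
            match_of_right[j] = i
        elif rank[j][i] < rank[j][cur]:
            match_of_right[j] = i  # j prefers i; cur becomes free again
            free.append(cur)
        else:
            free.append(i)         # j rejects i
    return {match_of_right[j]: j for j in range(n)}

def estimate_left_prefs(true_means, samples_per_pair=2000, noise=0.5, seed=0):
    """Naive (non-adaptive) preference learning: sample every (left, right)
    pair equally often with Gaussian noise, then rank by empirical mean."""
    rng = random.Random(seed)
    n = len(true_means)
    prefs = []
    for i in range(n):
        emp = [sum(true_means[i][j] + rng.gauss(0, noise)
                   for _ in range(samples_per_pair)) / samples_per_pair
               for j in range(n)]
        prefs.append(sorted(range(n), key=lambda j: -emp[j]))
    return prefs
```

With well-separated reward means, the estimated rankings coincide with the true ones and deferred acceptance on the estimates returns the true left-optimal stable matching; the paper's pure-exploration algorithms aim to certify this with far fewer samples by allocating them adaptively.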
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~andreas_athanasopoulos1, ~Christos_Dimitrakakis1
Track: Fast Track: published work
Publication Link: andreas.athanasopoulos@unine.ch, https://openreview.net/forum?id=1na8OQ7AAJ&referrer
Submission Number: 108