Bandit Learning in Many-to-One Matching Markets

12 May 2023 | OpenReview Archive Direct Upload | Readers: Everyone
Abstract: Two-sided matching markets are well studied in social science and economics. Recent work studies how to match while learning agents' unknown preferences in one-to-one matching markets. In many settings, however, such as online recruitment platforms for short-term workers, a company can select more than one agent, while an agent can select only one company at a time. These short-term workers try different companies many times to find the jobs that suit them best. We therefore consider a more general bandit learning problem in many-to-one matching markets, where each arm has a fixed capacity and agents make choices over multiple rounds. We develop algorithms for both the centralized and decentralized settings and prove regret bounds of order $O(\log T)$ and $O(\log^2 T)$, respectively. Extensive experiments show the convergence and effectiveness of our algorithms.
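To make the many-to-one structure concrete, here is a minimal Python sketch of agent-proposing deferred acceptance in which each arm (company) has a fixed capacity. This is the standard capacity-constrained Gale-Shapley procedure, not the algorithm proposed in the paper, which the abstract does not specify. The function name `deferred_acceptance` and its arguments are hypothetical, and preferences are assumed to be known here. In the bandit setting, one would presumably replace each agent's preference list with a ranking built from learned estimates, which is an assumption on our part.

```python
def deferred_acceptance(agent_prefs, arm_prefs, capacity):
    """Agent-proposing deferred acceptance with arm capacities.

    agent_prefs[a]: list of arms, most preferred first (assumed known here).
    arm_prefs[m]:   list of agents, most preferred first.
    capacity[m]:    maximum number of agents arm m can hold.
    Returns a dict mapping each arm to the sorted list of agents it holds.
    """
    # Position of each agent in each arm's preference list (lower is better).
    rank = {m: {a: i for i, a in enumerate(p)} for m, p in arm_prefs.items()}
    next_choice = {a: 0 for a in agent_prefs}  # index of next arm to propose to
    held = {m: [] for m in arm_prefs}          # agents tentatively held by each arm
    free = list(agent_prefs)                   # agents not currently held

    while free:
        a = free.pop()
        if next_choice[a] >= len(agent_prefs[a]):
            continue  # agent has exhausted its list and stays unmatched
        m = agent_prefs[a][next_choice[a]]
        next_choice[a] += 1
        held[m].append(a)
        if len(held[m]) > capacity[m]:
            # Arm is over capacity: reject its least-preferred held agent.
            held[m].sort(key=lambda x: rank[m][x])
            free.append(held[m].pop())

    return {m: sorted(agents) for m, agents in held.items()}


if __name__ == "__main__":
    # Three agents, arm "A" with capacity 2 and arm "B" with capacity 1.
    agent_prefs = {0: ["A", "B"], 1: ["A", "B"], 2: ["A", "B"]}
    arm_prefs = {"A": [0, 1, 2], "B": [0, 1, 2]}
    capacity = {"A": 2, "B": 1}
    print(deferred_acceptance(agent_prefs, arm_prefs, capacity))
```

In this toy instance all three agents prefer arm "A", which can hold only two of them, so its least-preferred agent is rejected and falls back to arm "B".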