Stochastic Bandits for Egalitarian Assignment

Eugene Lim; Vincent Y. F. Tan; Harold Soh

Published: 12 Oct 2024, Last Modified: 12 Oct 2024. Accepted by TMLR. License: CC BY 4.0

Abstract: We study \texttt{EgalMAB}, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In \texttt{EgalMAB}, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent's objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy \texttt{EgalUCB} and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.
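To make the problem setting concrete, the following is a minimal simulation sketch of the assignment loop described in the abstract. It is not the authors' \texttt{EgalUCB} policy, whose details are not given here. The Bernoulli rewards, the standard UCB1-style index, the greedy rule that gives the highest-index arm to the user with the lowest cumulative reward, and the name `simulate_egal_bandit` are all assumptions made for illustration.

```python
import math
import random


def simulate_egal_bandit(means, num_users, horizon, seed=0):
    """Illustrative heuristic only (not the paper's EgalUCB).

    means: assumed Bernoulli success probability of each arm.
    Returns the cumulative reward collected by each user.
    """
    rng = random.Random(seed)
    num_arms = len(means)
    assert num_users <= num_arms, "each user needs a distinct arm"

    pulls = [0] * num_arms           # times each arm has been assigned
    reward_sums = [0.0] * num_arms   # total reward observed on each arm
    user_totals = [0.0] * num_users  # cumulative reward of each user

    for t in range(1, horizon + 1):
        def ucb_index(arm):
            # Unexplored arms get priority; otherwise use a UCB1-style bonus.
            if pulls[arm] == 0:
                return float("inf")
            mean = reward_sums[arm] / pulls[arm]
            return mean + math.sqrt(2.0 * math.log(t) / pulls[arm])

        # Pick as many distinct arms as there are users, best index first.
        top_arms = sorted(range(num_arms), key=ucb_index, reverse=True)[:num_users]
        # Order users from lowest to highest cumulative reward.
        neediest_first = sorted(range(num_users), key=lambda u: user_totals[u])

        # One arm per user, no arm shared within a time step.
        for user, arm in zip(neediest_first, top_arms):
            reward = 1.0 if rng.random() < means[arm] else 0.0
            pulls[arm] += 1
            reward_sums[arm] += reward
            user_totals[user] += reward

    return user_totals


if __name__ == "__main__":
    totals = simulate_egal_bandit([0.9, 0.8, 0.5, 0.2], num_users=2, horizon=1000)
    print("per-user totals:", totals, "egalitarian objective:", min(totals))
```

The quantity of interest in this setting is `min(totals)`, the reward of the worst-off user, rather than the sum over users.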

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: We made some final changes for the camera-ready revision.

Supplementary Material: zip

Assigned Action Editor: ~Branislav_Kveton1

Submission Number: 3096
