Optimal Algorithm for Max-Min Fair Bandit

Zilong Wang; Zhiyao Zhang; Shuai Li

Optimal Algorithm for Max-Min Fair Bandit

Zilong Wang, Zhiyao Zhang, Shuai Li

20 Sept 2024 (modified: 05 Feb 2025)Submitted to ICLR 2025EveryoneRevisionsBibTeXCC BY 4.0

Keywords: multi-player multi-armed bandits, max-min fariness

Abstract: We consider a multi-player multi-armed bandit problem (MP-MAB) where $N$ players compete for $K$ arms in $T$ rounds. The reward distribution is heterogeneous where each player has a different expected reward for the same arm. When multiple players select the same arm, they collide and obtain zero reward. In this paper, we aim to find the max-min fairness matching that maximizes the reward of the player who receives the lowest reward. This paper improves the existing regret upper bound result of $O(\log T\log \log T)$ to achieve max-min fairness. More specifically, our decentralized fair elimination algorithm (DFE) deals with heterogeneity and collision carefully and attains a regret upper bounded of $O((N^2+K)\log T / \Delta)$, where $\Delta$ is the minimum reward gap between max-min value and sub-optimal arms. We assume $N\leq K$ to guarantee all players can select their arms without collisions. In addition, we also provide an $\Omega(\max\{N^2, K\} \log T / \Delta)$ regret lower bound for this problem. This lower bound indicates that our algorithm is optimal with respect to key parameters, which significantly improves the performance of algorithms in previous work. Numerical experiments again verify the efficiency and improvement of our algorithms.

Supplementary Material: zip

Primary Area: learning theory

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 2079

Loading