Variance-Adaptive Algorithm for Probabilistic Maximum Coverage Bandits with General Feedback

Published: 2023, Last Modified: 31 Jan 2025, INFOCOM 2023, CC BY-SA 4.0
Abstract: Probabilistic maximum coverage (PMC) is an important problem that can model many network applications, including mobile crowdsensing, network content delivery, and dynamic channel allocation, where an operator chooses nodes in a graph that can probabilistically cover other nodes. In this paper, we study PMC in the online learning setting: the PMC bandit. In the PMC bandit, the network parameters are not known a priori, so the decision maker must learn them while maximizing the total reward from the covered nodes. Although the PMC bandit has been studied previously, the existing model and its corresponding algorithm can be significantly improved. First, we propose the PMC-G bandit, whose feedback model generalizes the existing semi-bandit feedback, allowing the PMC bandit to model applications such as online content delivery and online dynamic channel allocation. Next, we improve the existing combinatorial upper confidence bound (CUCB) algorithm by introducing a variance-adaptive algorithm, VA-CUCB. We prove that VA-CUCB achieves a strictly better regret bound, improving on that of CUCB by a factor of $\tilde O(K)$, where $K$ is the number of nodes selected in each round. Finally, experiments on synthetic and real-world datasets show that VA-CUCB outperforms benchmark algorithms.
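To make the setting concrete, the following is a minimal sketch of the PMC objective and of a variance-aware optimistic estimate. It is not the paper's VA-CUCB: the independent-coverage reward model, the greedy selection rule, the Bernstein-style bonus and its constants, and all names (`expected_coverage`, `greedy_select`, `variance_aware_ucb`, `p`, `weights`) are assumptions made for illustration only.

```python
import math


def expected_coverage(selected, p, weights):
    """Expected reward of choosing the nodes in `selected`.

    Assumption: p[u][v] is the probability that chosen node u covers
    target v, coverage events are independent, and target v pays
    weights[v] if at least one chosen node covers it.
    """
    total = 0.0
    for v, w in enumerate(weights):
        miss = 1.0  # probability that no chosen node covers v
        for u in selected:
            miss *= 1.0 - p[u][v]
        total += w * (1.0 - miss)
    return total


def greedy_select(p, weights, k):
    """Pick k nodes greedily by marginal expected coverage (illustrative)."""
    selected = []
    candidates = set(range(len(p)))
    for _ in range(k):
        # sorted() makes tie-breaking deterministic (lowest index wins)
        best = max(
            sorted(candidates),
            key=lambda u: expected_coverage(selected + [u], p, weights),
        )
        selected.append(best)
        candidates.remove(best)
    return selected


def variance_aware_ucb(mean, count, t):
    """Generic Bernstein-style optimistic estimate of a Bernoulli mean.

    Assumption: the constants 2 and 3 are illustrative and are not
    taken from the paper. The bonus shrinks when the empirical
    variance mean * (1 - mean) is small.
    """
    if count == 0:
        return 1.0  # no observations yet: be fully optimistic
    var = mean * (1.0 - mean)
    log_term = math.log(max(t, 2))
    bonus = math.sqrt(2.0 * var * log_term / count) + 3.0 * log_term / count
    return min(1.0, mean + bonus)
```

In a bandit loop under these assumptions, each round would replace every unknown `p[u][v]` with its optimistic estimate, call `greedy_select` with `k` equal to $K$, and update the empirical means from whatever feedback is observed. The variance term is what lets near-deterministic edges receive a smaller exploration bonus than edges with probability close to 0.5.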