Collaborative Top Distribution Identifications with Limited Interaction (Extended Abstract)

Nikolai Karpov, Qin Zhang, Yuan Zhou

2020 (modified: 23 Jan 2023)FOCS 2020Readers: Everyone

Abstract: We consider the following problem in this paper: given a set of <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$n$</tex> distributions, find the top- <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$m$</tex> ones with the largest means. This problem is also called top- <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$m$</tex> arm identifications in the literature of reinforcement learning, and has numerous applications. We study the problem in the collaborative learning model where we have multiple agents who can draw samples from the <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$n$</tex> distributions in parallel. Our goal is to characterize the tradeoffs between the running time of learning process and the number of rounds of interaction between agents, which is very expensive in various scenarios. We give optimal time-round tradeoffs, as well as demonstrate complexity separations between top-1 arm identification and top- <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$m$</tex> arm identifications for general <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$m$</tex> and between fixed-time and fixed-confidence variants. As a byproduct, we also give an algorithm for selecting the distribution with the <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$m$</tex> -th largest mean in the collaborative learning model.

0 Replies