Keywords: Best arm identification, large deviation principles.
Abstract: Experimental design is crucial in evidence-based decision-making with multiple treatment arms, such as online advertisements and medical treatments. This study investigates an experiment whose task is to identify the best treatment arm, that is, the arm with the highest expected outcome. In our experiments, given a fixed number of sample-allocation rounds and multiple treatment arms, we allocate a sample to a treatment arm and observe the corresponding outcome at each round. At the end of the experiment, we recommend one of the treatment arms as the best based on the observations. We aim to design an experiment that minimizes the probability of misidentifying the best treatment arm. This problem has been explored under various names across numerous research fields, including best arm identification (BAI) and ordinal optimization. With this objective in mind, we first derive lower bounds for the probability of misidentification through an information-theoretic approach, enabling discussions on the asymptotic optimality of experiments. In our analysis, we find that the available information on the distribution of rewards for each treatment arm significantly influences the asymptotic optimality of experiments. Moreover, we find that the asymptotic optimality depends on a pre-specified set of hypothetical best treatment arms used for sample allocation. Existing experiments become asymptotically optimal when the true best treatment arm is in the set. The standard BAI is a special case in which all treatment arms are hypothetical best treatment arms. Based on the lower bounds, we design experiments whose probability of misidentification matches the lower bounds given the available information.
Submission Number: 94