2014 (modified: 08 Nov 2022), COLT 2014, Readers: Everyone
Abstract: The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of ...