lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

2014 (modified: 08 Nov 2022) | COLT 2014 | Readers: Everyone
Abstract: The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of ...