2014 (modified: 11 Nov 2022), ICML 2014, Readers: Everyone
Abstract: We present algorithms for reducing the Dueling Bandits problem to the conventional (stochastic) Multi-Armed Bandits problem. The Dueling Bandits problem is an online model of learning with ordinal ...