A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

Pratik Gajane, Tanguy Urvoy, Fabrice Clérot

2015 (modified: 11 Nov 2022) | ICML 2015 | Readers: Everyone

Abstract: We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of a...
