PAC Identification of a Bandit Arm Relative to a Reward Quantile

AAAI 2017 (modified: 02 Mar 2020)
Abstract: We propose a PAC formulation for identifying an arm in an n-armed bandit whose mean is within a fixed tolerance of the m-th highest mean. This setup generalises a previous formulation with m
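
The acceptance criterion described in the abstract can be illustrated with a minimal sketch. It assumes the true means are known and reads "within a fixed tolerance" one-sidedly, so an arm is acceptable when its mean is at least the m-th highest mean minus the tolerance. The function name `is_acceptable_arm`, the parameter `epsilon`, and the example means are hypothetical and do not come from the paper; the sketch only checks the criterion and is not an identification algorithm.

```python
from typing import Sequence


def is_acceptable_arm(means: Sequence[float], arm: int, m: int, epsilon: float) -> bool:
    """Check whether `arm` meets the (assumed, one-sided) tolerance criterion.

    An arm is treated as acceptable if its mean is at least the m-th highest
    mean minus epsilon. This reading of the abstract is an assumption.
    """
    if not 1 <= m <= len(means):
        raise ValueError("m must be between 1 and the number of arms")
    # m is 1-indexed: m=1 refers to the single highest mean.
    mth_highest = sorted(means, reverse=True)[m - 1]
    return means[arm] >= mth_highest - epsilon


# Hypothetical 5-armed instance, used only for illustration.
means = [0.9, 0.8, 0.5, 0.45, 0.2]
acceptable = [a for a in range(len(means))
              if is_acceptable_arm(means, a, m=3, epsilon=0.1)]
print(acceptable)  # [0, 1, 2, 3]
```

With m = 3 and a tolerance of 0.1, the third-highest mean is 0.5, so every arm with mean at least 0.4 qualifies. A PAC algorithm would have to return one such arm with high probability from samples alone, without access to the true means.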