Multi-thresholding Good Arm Identification with Bandit Feedback

Xuanke Jiang; Sherief Hashima; Kohei Hatano; Eiji Takimoto

Published: 01 Sept 2025, Last Modified: 18 Nov 2025 | ACML 2025 Conference Track | CC BY 4.0

Abstract: We consider a good arm identification problem in a multi-objective stochastic bandit setting, where each arm $i\in[K]$ is associated with a distribution $\mathcal{D}_i$ defined over $\mathbb{R}^M$. In each round $t$, the player (algorithm) pulls one arm $i_t$ and receives an $M$-dimensional feedback vector sampled from $\mathcal{D}_{i_t}$. The goal is twofold: either find an arm whose means are larger than the predefined thresholds $\xi_1,\ldots,\xi_M$, with confidence $\delta$ and accuracy $\epsilon$, within a bounded sample complexity, or output $\bot$ to indicate that no such arm exists. We propose an algorithm with a sample complexity bound. Our bound matches the one given in previous work when $M=1$ and $\epsilon = 0$, and we give novel bounds for $M > 1$ and $\epsilon > 0$. In experiments on synthetic and real datasets, the proposed algorithm attains better numerical performance than the baselines.
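
To make the problem setting concrete, the following is a minimal sketch of a naive round-robin baseline. It is not the algorithm proposed in the paper. It rests on several assumptions that the abstract does not state: rewards lie in $[0,1]$ in every coordinate, an arm $i$ counts as $\epsilon$-good when $\min_m (\mu_{i,m} - \xi_m) \ge -\epsilon$, and confidence intervals come from Hoeffding's inequality with a union bound. The names `naive_good_arm_search`, `make_bernoulli_env`, and `max_rounds` are hypothetical and introduced only for illustration.

```python
import math
import random


def make_bernoulli_env(means, rng):
    """Hypothetical test environment: arm i returns M independent Bernoulli draws."""
    def pull(i):
        return [1.0 if rng.random() < p else 0.0 for p in means[i]]
    return pull


def naive_good_arm_search(pull, K, M, thresholds, delta=0.05, eps=0.1,
                          max_rounds=100_000):
    """Round-robin baseline (an assumption, not the paper's method).

    Returns the index of an arm judged eps-good, or None (standing in for
    the symbol "bot") when every arm is judged to miss some threshold.
    Assumes each reward coordinate lies in [0, 1].
    """
    sums = [[0.0] * M for _ in range(K)]
    for n in range(1, max_rounds + 1):
        # Pull every arm once, so each arm has exactly n samples.
        for i in range(K):
            x = pull(i)
            for m in range(M):
                sums[i][m] += x[m]
        # Hoeffding radius; the union bound over arms, objectives, and
        # rounds keeps the total failure probability below delta.
        rad = math.sqrt(math.log(4 * K * M * n * n / delta) / (2 * n))
        all_bad = True
        for i in range(K):
            # Worst empirical margin of arm i over the M thresholds.
            gap = min(sums[i][m] / n - thresholds[m] for m in range(M))
            if gap - rad >= -eps:   # lower bound certifies an eps-good arm
                return i
            if gap + rad >= 0:      # arm i cannot be ruled out yet
                all_bad = False
        if all_bad:
            return None
    raise RuntimeError("sampling budget exhausted before a decision")


if __name__ == "__main__":
    rng = random.Random(0)
    env = make_bernoulli_env([[0.2, 0.9], [0.8, 0.7]], rng)
    print(naive_good_arm_search(env, K=2, M=2, thresholds=[0.5, 0.5]))
```

Under these assumptions the loop stops once the radius falls below $\epsilon/2$ when $\epsilon > 0$, because every arm is then either certified or ruled out. Sampling all arms uniformly is the simplest choice for a sketch; an adaptive sampling rule would typically need fewer pulls.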

Supplementary Material: pdf

Submission Number: 29
