Constrained Linear Best Arm Identification with Covariate Selection

ICLR 2026 Conference Submission13754 Authors

18 Sept 2025 (modified: 08 Oct 2025)ICLR 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Constrained Best Arm Identification, Covariate Selection, Fixed-Confidence, Sample Complexity
Abstract: This paper studies a constrained linear best arm identification problem with covariate selection in the fixed-confidence setting, where each arm is evaluated across multiple performance metrics. The mean performance of each metric depends linearly on the feature vectors of both arms and covariates. The goal is to identify the arm with the highest expected value of one targeted metric while ensuring that the means of the remaining metrics stay below specified thresholds for each covariate. We first establish an instance-dependent lower bound on the sample complexity, formulated as a multi-level optimization problem that captures both feasibility and optimality. We then prove that this bound is tight by designing an algorithm that asymptotically matches it. Since the original algorithm is computationally intensive, we develop a relaxed version of the bound through a surrogate optimization problem and derive its convex dual. Using this bound, we propose a duality-based decomposition algorithm that is computationally efficient, updating only two coordinates and performing a single gradient step per iteration. We further show that the algorithm achieves the relaxed bound in theory and demonstrates its practical effectiveness through numerical experiments.
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 13754
Loading