Abstract: Piecewise constant functions describe a variety of real-world phenomena in domains ranging from chemistry to manufacturing. In practice, it is often required to confidently identify the locations of the abrupt changes in these functions as quickly as possible. For this, we introduce a fixed-confidence piecewise constant bandit problem. Here, we sequentially query points in the domain and receive noisy evaluations of the function under bandit feedback. We provide instance-dependent lower bounds for the complexity of change point identification in this problem. These lower bounds illustrate that an optimal method should focus its sampling efforts adjacent to each of the change points, and the number of samples around each change point should be inversely proportional to the magnitude of the change.
Building on this, we devise a simple and computationally efficient variant of Track-and-Stop and prove that it is asymptotically optimal in many regimes. We support our theoretical findings with experimental results in synthetic environments demonstrating the efficiency of our method.
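To make the allocation claim concrete, here is a minimal toy sketch (not the paper's algorithm): a piecewise constant function over a finite set of arms with noisy bandit feedback, where the sampling weight assigned near each change point is taken inversely proportional to the jump magnitude, as the lower bounds suggest. All values below (means, noise level) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy piecewise constant function on 10 arms (hypothetical values):
# the mean jumps between arms 2-3 (gap 1.5) and arms 6-7 (gap 0.2).
means = np.array([0.0, 0.0, 0.0, 1.5, 1.5, 1.5, 1.5, 1.7, 1.7, 1.7])
sigma = 0.5  # noise level of the bandit feedback

def pull(arm):
    """One noisy evaluation of the piecewise constant function."""
    return means[arm] + sigma * rng.normal()

# Change points and their jump magnitudes.
gaps = np.abs(np.diff(means))
change_points = np.flatnonzero(gaps)  # indices i with means[i] != means[i+1]
magnitudes = gaps[change_points]      # here: [1.5, 0.2]

# Concentrate sampling effort next to each change point, with weight
# inversely proportional to the jump magnitude: smaller jumps are
# harder to localise, so they receive more samples.
weights = 1.0 / magnitudes
allocation = weights / weights.sum()
print(dict(zip(change_points.tolist(), allocation.round(3))))
```

In this toy instance the small 0.2 jump receives most of the budget, matching the intuition that subtle changes dominate the sample complexity.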
Lay Summary: In many settings, such as chemistry or manufacturing, tiny changes in the experimental input (e.g. temperature) can lead to large changes in the output (e.g. yield). Often it is necessary to locate the changes in input that lead to these big jumps in output. Moreover, it is often expensive to run these experiments at every possible input value. We therefore aim to find methods to locate the jumps as efficiently as possible.
To tackle this problem, we construct methods to sequentially choose an input (e.g. temperature) at which we will observe a noisy output. We can use previous observations to carefully select these points. We repeat the process until we can confidently stop and return the locations of the set of jumps, or change points. In this paper, we mathematically upper bound the expected time for our proposed methods to stop. We then show that these expected stopping times are theoretically optimal when we want high confidence that the returned change point locations are correct. We complement these results with experiments showing that our methods are effective in practice.
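The sample-then-stop loop described above can be sketched in miniature for a single candidate change point. This is only an illustration under assumed values (true outputs, noise level, target error probability), using a generic Hoeffding-style stopping rule rather than the paper's exact stopping condition: we alternate noisy observations on either side of the candidate jump and stop once the estimated jump exceeds its confidence width.

```python
import math
import random

random.seed(1)

# Hypothetical setup: two adjacent inputs whose true outputs differ by a jump.
TRUE = {"left": 0.0, "right": 1.0}
SIGMA = 0.5   # noise level of each observation
DELTA = 0.05  # target error probability (fixed confidence)

def observe(x):
    """One noisy experiment at input x."""
    return TRUE[x] + random.gauss(0.0, SIGMA)

sums = {"left": 0.0, "right": 0.0}
t = 0
while True:
    t += 1
    for x in sums:  # sample both sides of the candidate jump
        sums[x] += observe(x)
    gap_hat = abs(sums["right"] - sums["left"]) / t
    # Anytime confidence width (a standard generic choice, not the paper's rule).
    width = 2 * SIGMA * math.sqrt(2 * math.log(2 * t * t / DELTA) / t)
    if gap_hat > width:  # stop once the jump is statistically certain
        break

print(f"declared a change point after {t} rounds (estimated jump {gap_hat:.2f})")
```

The loop stops sooner for large jumps and later for small ones, which is exactly the dependence the expected stopping-time bounds capture.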
Primary Area: Theory->Online Learning and Bandits
Keywords: bandits, pure exploration, fixed confidence, change points
Submission Number: 14275