Piecewise-Stationary Bandits with Knapsacks

Xilin Zhang; Wang Chi Cheung

Published: 25 Sept 2024, Last Modified: 06 Nov 2024. Venue: NeurIPS 2024 poster. License: CC BY 4.0

Keywords: non-stationary bandits, bandits with knapsacks, competitive ratio

TL;DR: We propose novel inventory-reserving algorithms for a piecewise-stationary BwK problem. They achieve provably near-optimal competitive ratios against the dynamic benchmark, adapt to the available information, and do not require bounded global variation.

Abstract: We study Bandits with Knapsacks (BwK) in a piecewise-stationary environment. We propose a novel inventory-reserving algorithm that offers new insights into the problem. Suppose parameters $\eta_{\min}, \eta_{\max} \in (0,1]$ respectively lower and upper bound the reward earned and the resources consumed in a time round. Our algorithm achieves a provably near-optimal competitive ratio of $O(\log(\eta_{\max}/\eta_{\min}))$, and we provide a matching lower bound. Our performance guarantee is measured against a dynamic benchmark, which distinguishes our work from existing works on adversarial BwK that compare against the static benchmark. Furthermore, unlike existing work on non-stationary BwK, we do not require bounded global variation.

Primary Area: Bandits

Submission Number: 8619
