Partially observable restless bandits with restarts: indexability and computation of Whittle index

Published: 01 Jan 2022, Last Modified: 15 May 2023 | CDC 2022 | Readers: Everyone
Abstract: We consider restless bandits with restarts, where the state of each active arm resets according to a known probability distribution while the state of each passive arm evolves in a Markovian manner. We assume that the state of an arm is observed after it is reset but not observed otherwise. We show that the model is indexable and propose an efficient algorithm to compute the Whittle index by exploiting the qualitative properties of the optimal policy. A detailed numerical study of machine repair models shows that the Whittle index policy outperforms the myopic policy and is close to the optimal policy.
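
The observation model in the abstract can be illustrated with a small belief-tracking sketch. This is only an illustration under stated assumptions, not the paper's algorithm: the function names, the 3-state transition matrix `P`, and the reset distribution `Q` are all hypothetical. It shows that a passive arm's belief propagates through the Markov transition matrix, while an active arm's state is redrawn from the reset distribution and then observed, so the belief collapses to a point mass.

```python
import random

def passive_belief_update(belief, P):
    """One passive step: the state is unobserved, so the belief becomes belief @ P."""
    n = len(P)
    return [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]

def active_reset(Q, rng):
    """Active step: the state is redrawn from Q and observed (belief is a point mass)."""
    n = len(Q)
    state = rng.choices(range(n), weights=Q, k=1)[0]
    belief = [1.0 if j == state else 0.0 for j in range(n)]
    return state, belief

# Hypothetical machine-repair arm: state 0 is "new", state 2 is "failed".
P = [[0.8, 0.2, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]
Q = [1.0, 0.0, 0.0]  # assumed: a repair always returns the machine to state 0

rng = random.Random(0)
state, belief = active_reset(Q, rng)   # arm activated: state reset and observed
for _ in range(2):                     # arm left passive for two steps
    belief = passive_belief_update(belief, P)
```

Under these assumptions, the belief of a passive arm is fully determined by the last observed state and the number of steps since the last reset, which gives a compact way to index the information state.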