Beyond "To Act or Not to Act": Fast Lagrangian Approaches to General Multi-Action Restless Bandits
Abstract: This paper presents new algorithms and theoretical results for Multi-action Multi-armed Restless Bandits, an important but insufficiently studied generalization of traditional Multi-armed Restless Bandits (MARBs). Though MARBs are popular for modeling many problems, they are restricted to binary actions, i.e., "to act or not to act". This renders them unable to capture critical complexities faced by planners in real domains, such as a system manager balancing maintenance, repair, and job scheduling, or a health worker deciding among treatments for a given patient. The limited previous work on Multi-action MARBs has addressed only specialized subproblems. Here we derive multiple algorithms for general Multi-action MARBs using Lagrangian relaxation techniques, leading to the following contributions: (i) We develop BLam, a bound-optimization algorithm that leverages problem convexity to quickly and provably converge to the well-performing Lagrange policy; (ii) We develop SampleLam, a fast sampling technique for estimating the Lagrange policy, and derive a concentration bound to investigate its convergence properties; (iii) We derive best- and worst-case computational complexities for our algorithms as well as for our main competitor; (iv) We provide experimental results comparing our algorithms to baselines on simulated distributions, including one motivated by a real-world community health intervention task. Our approach achieves significant speedups, up to ten-fold over more general methods, without sacrificing performance, and is widely applicable across general Multi-action MARBs. Code is available.
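
To make the Lagrangian relaxation behind the Lagrange policy concrete, the sketch below relaxes the per-round budget constraint with a multiplier lambda, solves each arm's MDP independently with action costs priced at lambda, and minimizes the resulting bound over lambda, which is convex in lambda. This is a minimal illustrative sketch on an assumed synthetic instance, not the paper's BLam or SampleLam implementation; the function names, the toy dynamics, and the ternary-search minimization are all assumptions made for illustration.

```python
# Minimal sketch of the Lagrangian relaxation for multi-action restless
# bandits. NOT the paper's BLam/SampleLam code: the instance, names, and
# ternary-search minimization are illustrative assumptions.
import numpy as np

def value_iteration(T, R, costs, lam, gamma=0.95, tol=1e-6):
    """Solve one arm's MDP with action costs charged at price lam.

    T: (A, S, S) transition tensor, R: (S,) state rewards, costs: (A,).
    Returns the optimal value function for reward R(s) - lam * c(a).
    """
    V = np.zeros(T.shape[1])
    while True:
        # Q[a, s] = R(s) - lam * c(a) + gamma * E[V(s') | s, a]
        Q = R[None, :] - lam * costs[:, None] + gamma * (T @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def lagrange_bound(lam, arms, start_states, budget, gamma=0.95):
    """L(lam) = lam * B / (1 - gamma) + sum_i V_i^lam(s_i); convex in lam."""
    total = lam * budget / (1.0 - gamma)
    for (T, R, costs), s in zip(arms, start_states):
        total += value_iteration(T, R, costs, lam, gamma)[s]
    return total

def minimize_lagrange(arms, start_states, budget, lo=0.0, hi=10.0, iters=60):
    """Ternary search for the minimizing lam, valid since L(lam) is convex."""
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if lagrange_bound(m1, arms, start_states, budget) < \
           lagrange_bound(m2, arms, start_states, budget):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Tiny synthetic instance: 3 arms, 2 states, 3 actions with costs 0, 1, 2.
rng = np.random.default_rng(0)
arms = []
for _ in range(3):
    T = rng.dirichlet(np.ones(2), size=(3, 2))   # each T[a, s, :] sums to 1
    R = np.array([0.0, 1.0])                     # reward only in state 1
    arms.append((T, R, np.array([0.0, 1.0, 2.0])))
lam_star = minimize_lagrange(arms, start_states=[0, 0, 0], budget=2.0)
print(f"approximate minimizing multiplier: {lam_star:.3f}")
```

The key property this sketch exploits is the one the abstract credits BLam with leveraging: the relaxed bound is convex in the multiplier, so a simple bracketing search suffices here, and more aggressive bound-optimization schemes can converge faster.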