Beyond "To Act or Not to Act": Fast Lagrangian Approaches to General Multi-Action Restless Bandits
Abstract: This paper presents new algorithms and theoretical results for Multi-action Multi-armed Restless Bandits, an important but insufficiently studied generalization of traditional Multi-armed Restless Bandits (MARBs). Though MARBs are popular for modeling many problems, they are restricted to binary actions, i.e., "to act or not to act". This renders them unable to capture critical complexities faced by planners in real domains, such as a system manager balancing maintenance, repair, and job scheduling, or a health worker deciding among treatments for a given patient. The limited previous work on Multi-action MARBs has addressed only specialized subproblems. Here we derive multiple algorithms for general Multi-action MARBs using Lagrangian relaxation techniques, leading to the following contributions: (i) We develop BLam, a bound-optimization algorithm that leverages problem convexity to quickly and provably converge to the well-performing Lagrange policy; (ii) We develop SampleLam, a fast sampling technique for estimating the Lagrange policy, and derive a concentration bound to investigate its convergence properties; (iii) We derive best- and worst-case computational complexities for our algorithms as well as for our main competitor; (iv) We provide experimental results comparing our algorithms to baselines on simulated distributions, including one motivated by a real-world community health intervention task. Our approach achieves significant speedups, up to ten-fold over more general methods, without sacrificing performance, and is widely applicable across general Multi-action MARBs. Code is available.
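
To make the Lagrangian relaxation behind the Lagrange policy concrete, the sketch below relaxes the per-round budget constraint with a multiplier lambda, solves each arm's MDP independently with action costs priced at lambda, and minimizes the resulting bound over lambda, which is convex in lambda. This is a minimal illustrative sketch on an assumed synthetic instance, not the paper's BLam or SampleLam implementation; the function names, the toy dynamics, and the ternary-search minimization are all assumptions made for illustration.

```python
# Minimal sketch of the Lagrangian relaxation for multi-action restless
# bandits. NOT the paper's BLam/SampleLam code: the instance, names, and
# ternary-search minimization are illustrative assumptions.
import numpy as np

def value_iteration(T, R, costs, lam, gamma=0.95, tol=1e-6):
    """Solve one arm's MDP with action costs charged at price lam.

    T: (A, S, S) transition tensor, R: (S,) state rewards, costs: (A,).
    Returns the optimal value function for reward R(s) - lam * c(a).
    """
    V = np.zeros(T.shape[1])
    while True:
        # Q[a, s] = R(s) - lam * c(a) + gamma * E[V(s') | s, a]
        Q = R[None, :] - lam * costs[:, None] + gamma * (T @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def lagrange_bound(lam, arms, start_states, budget, gamma=0.95):
    """L(lam) = lam * B / (1 - gamma) + sum_i V_i^lam(s_i); convex in lam."""
    total = lam * budget / (1.0 - gamma)
    for (T, R, costs), s in zip(arms, start_states):
        total += value_iteration(T, R, costs, lam, gamma)[s]
    return total

def minimize_lagrange(arms, start_states, budget, lo=0.0, hi=10.0, iters=60):
    """Ternary search for the minimizing lam, valid since L(lam) is convex."""
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if lagrange_bound(m1, arms, start_states, budget) < \
           lagrange_bound(m2, arms, start_states, budget):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Tiny synthetic instance: 3 arms, 2 states, 3 actions with costs 0, 1, 2.
rng = np.random.default_rng(0)
arms = []
for _ in range(3):
    T = rng.dirichlet(np.ones(2), size=(3, 2))   # each T[a, s, :] sums to 1
    R = np.array([0.0, 1.0])                     # reward only in state 1
    arms.append((T, R, np.array([0.0, 1.0, 2.0])))
lam_star = minimize_lagrange(arms, start_states=[0, 0, 0], budget=2.0)
print(f"approximate minimizing multiplier: {lam_star:.3f}")
```

The key property this sketch exploits is the one the abstract credits BLam with leveraging: the relaxed bound is convex in the multiplier, so a simple bracketing search suffices here, and more aggressive bound-optimization schemes can converge faster.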