Keywords: Fairness, predict-then-optimize, decision-focused learning, restless bandits
TL;DR: We develop a decision-focused learning pipeline to solve equitable RMABs, using a novel budget allocation algorithm to prevent disparity between groups.
Abstract: Restless multi-armed bandits (RMABs) model the sequential allocation of a limited number of resources to agents, each modeled as a Markov decision process. RMABs have applications in cellular networks, anti-poaching, and, in particular, healthcare. In such high-stakes use cases, allocations are often required to treat different groups of agents (e.g., groups defined by sensitive attributes) fairly. Beyond the fairness challenge, agents' transition probabilities are often unknown in real-world problems and must be learned.
Thus, group fairness in RMABs requires simultaneously learning transition probabilities and deciding how much budget to allocate to each group. To overcome this key challenge, which previous work ignored, we develop a decision-focused learning pipeline to solve equitable RMABs, using a novel budget allocation algorithm to prevent disparity between groups. Our results on both synthetic and large-scale real-world datasets demonstrate that incorporating fair planning into the learning step greatly improves equity with little sacrifice in utility.
List Of Authors: Verma, Shresth and Zhao, Yunfan and Shah, Sanket and Boehmer, Niclas and Taneja, Aparna and Tambe, Milind
Submission Number: 499