Contextual Budget Bandit for Food Rescue Volunteer Engagement

Published: 28 Nov 2025, Last Modified: 30 Nov 2025 · NeurIPS 2025 Workshop MLxOR · CC BY 4.0
Keywords: food rescue, restless bandit, budget allocation
TL;DR: We introduce algorithms for contextual budget allocation in restless multi-armed bandits to address reward disparity across context groups in crowdsourced food rescue volunteer engagement.
Abstract: Restless multi-armed bandits (RMABs) are an extension of the multi-armed bandit framework where pulling an arm results in both a reward and a Markovian state change. While RMABs are employed in domains including public health and food insecurity, their objective of maximizing cumulative reward comes at the expense of reward disparity across different context groups. To address this issue, we introduce contextual budget allocation, which optimizes the budget amount across different contexts in addition to the traditional budget allocation within each context. This allows higher-need groups to receive larger budgets. We develop two novel policies: (1) COcc, an empirically fast heuristic algorithm based on the Whittle index policy, and (2) Mitosis, a provably optimal algorithm that combines a branch-and-bound search structure with a no-regret learning framework. We conduct extensive experiments on synthetic and real food rescue datasets.
Submission Number: 52