Keywords: Decision Focused Learning, Restless Multi-Armed Bandits, Predict-then-optimize, AI for Social Impact, Population Health
TL;DR: Decision Focused Learning (DFL) solves the objective mismatch in predict-then-optimize approaches to optimisation problems with unknown parameters by integrating the optimisation problem into the learning pipeline.
Abstract: Many real-world optimization problems with unknown underlying model parameters are solved using the predict-then-optimize framework: a model is first learnt to predict the parameters of the optimization problem, which is then solved using an optimization algorithm. However, this approach maximises predictive accuracy rather than the quality of the final solution. Decision Focused Learning (DFL) solves this objective mismatch by integrating the optimization problem into the learning pipeline. Previous works have only shown the applicability of DFL in simulation settings. In our work, we consider the optimization problem of scheduling limited live service calls in Maternal and Child Health Awareness Programs and model it using Restless Multi-Armed Bandits (RMAB). In collaboration with an NGO, we conduct a large-scale field study of 9000 beneficiaries over 6 weeks and track key engagement metrics in a mobile health awareness program. To the best of our knowledge, this is the first real-world study involving Decision Focused Learning. We demonstrate that beneficiaries in the DFL group experience statistically significant reductions in cumulative engagement drop, while those in the Predict-then-Optimize group do not. This establishes the practicality of using Decision Focused Learning for real-world problems. We also demonstrate that DFL learns a better decision boundary between the RMAB actions and strategically predicts the parameters that contribute most to the final decision outcome.