Keywords: Bandit Algorithms, Causal Inference, Supervised Learning, mHealth, Mixed-effects Modeling
TL;DR: The authors propose a robust contextual bandit algorithm for optimizing mobile health interventions that leverages (1) mixed effects, (2) nearest-neighbor regularization, and (3) debiased machine learning (DML).
Abstract: Mobile health leverages personalized and contextually tailored interventions optimized through bandit and reinforcement learning algorithms. In practice, however, challenges such as participant heterogeneity, nonstationarity, and nonlinear relationships hinder algorithm performance. We propose RoME, a **Ro**bust **M**ixed-**E**ffects contextual bandit algorithm that simultaneously addresses these challenges via (1) modeling the differential reward with user- and time-specific random effects, (2) network cohesion penalties, and (3) debiased machine learning for flexible estimation of baseline rewards. We establish a high-probability regret bound that depends solely on the dimension of the differential-reward model, yielding robust guarantees even when the baseline reward is highly complex. We demonstrate the superior performance of the RoME algorithm in a simulation and two off-policy evaluation studies.
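For intuition, here is a minimal sketch of the kind of reward decomposition the abstract describes; the notation and the exact functional form are illustrative assumptions, not taken from the paper. The baseline reward enters nonparametrically (to be estimated flexibly, e.g., via DML), while the differential (treatment) reward is linear with a fixed effect plus user- and time-specific random effects.

```latex
% Illustrative decomposition (assumed notation, not the paper's exact model):
% reward for user i at time t, with context X_{i,t} and binary action A_{i,t}.
\mathbb{E}\!\left[R_{i,t} \mid X_{i,t}, A_{i,t}\right]
  = \underbrace{f\!\left(X_{i,t}\right)}_{\text{baseline reward (flexible, via DML)}}
  \;+\; A_{i,t}\, X_{i,t}^{\top}
    \underbrace{\left(\beta + b_i + \gamma_t\right)}_{\text{fixed effect } + \text{ user/time random effects}}
```

Under this reading, the network cohesion (nearest-neighbor) penalties would shrink the random effects of neighboring users and times toward one another, and the stated regret bound depends only on the dimension of the differential-reward term rather than on the complexity of the baseline function.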
Supplementary Material: zip
Primary Area: Machine learning for healthcare
Submission Number: 4693