Abstract: In this paper, we applied reinforcement learning to optimize Contingency Management (CM) in the treatment of substance use disorders (SUDs). CM is a behavioral treatment that uses financial incentives to encourage abstinence. We developed new reinforcement learning frameworks in which clinical decision-makers choose actions, namely CM prize structures, to maximize the abstinence behavior of participants. The choice of voucher or prize was modeled as a sequential decision with two parameters: the probability of receiving a prize on each lottery draw and the escalation rule that determines the number of draws. Reinforcement learning with off-policy evaluation was used to estimate outcomes under different CM schemas from observed clinical trial data. We searched for CM schemas that maximize treatment outcomes subject to budget constraints. Using the proposed framework, we analyzed data from the National Drug Abuse Treatment Clinical Trials Network-0007 (CTN-0007) trial to construct unbiased estimators of outcomes under new CM schemas. Our results indicated that the optimal CM schema would improve treatment outcomes by 32% while maintaining the overall financial cost. Our methods and results have broad applications in future clinical trial planning and in translational investigations of behavioral treatments for SUDs.
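To make the two schema parameters concrete, the following is a minimal sketch, not the paper's implementation. It assumes a common prize-based CM convention: the number of draws grows with each consecutive abstinent visit and resets after a non-abstinent visit. The function names (`draws_per_visit`, `expected_cost`) and the parameters `start`, `step`, `cap`, and `mean_prize_value` are hypothetical and chosen only for illustration.

```python
from typing import List


def draws_per_visit(abstinent: List[bool], start: int = 1,
                    step: int = 1, cap: int = 8) -> List[int]:
    """Number of lottery draws earned at each visit.

    Assumed escalation rule (illustrative only): an abstinent visit earns
    `start` draws plus `step` for every preceding consecutive abstinent
    visit, up to `cap`; a non-abstinent visit earns 0 and resets the streak.
    """
    draws = []
    streak = 0  # consecutive abstinent visits before the current one
    for ok in abstinent:
        if ok:
            draws.append(min(start + step * streak, cap))
            streak += 1
        else:
            draws.append(0)
            streak = 0
    return draws


def expected_cost(abstinent: List[bool], prize_prob: float,
                  mean_prize_value: float, **rule: int) -> float:
    """Expected prize spending for one participant's visit history.

    Each draw wins with probability `prize_prob`; a winning draw costs
    `mean_prize_value` on average (a hypothetical simplification).
    """
    total_draws = sum(draws_per_visit(abstinent, **rule))
    return total_draws * prize_prob * mean_prize_value


history = [True, True, False, True]
print(draws_per_visit(history))
print(expected_cost(history, prize_prob=0.5, mean_prize_value=10.0))
```

Under these assumptions, a budget constraint amounts to bounding `expected_cost` averaged over participants while varying `prize_prob` and the escalation rule.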