Dispatching Ambulances using Deep Reinforcement Learning

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: applications to robotics, autonomy, planning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Emergency Medical Service, Ambulance Dispatch, Deep Reinforcement Learning, Proximal Policy Optimization
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We develop an ambulance-dispatching method based on Proximal Policy Optimization and show that it outperforms commonly used heuristic policies such as dispatching the closest ambulance by Haversine or Euclidean distance.
Abstract: Emergency Medical Service (EMS) plays an essential role in today's society. One EMS component is ambulance dispatch, which affects the response time to a medical incident. Fast response times are critical. The ambulance-dispatching problem differs from a typical Vehicle Routing Problem (VRP) because patients arrive stochastically, making the problem hard to solve. In addition to minimizing response time, EMS providers seek optimal resource utilization and good working conditions for EMS personnel, often while facing increasing demand. To meet these requirements, this work develops a reinforcement learning (RL) method based on Proximal Policy Optimization (PPO) for the ambulance-dispatching problem. Varying incident priorities and more flexible incident-queue management are also integrated into our method. Our PPO-based method and an EMS simulation model are implemented in Python and combined with OpenStreetMap (OSM) travel-time estimation and simple synthetic incident-data generation. Empirical results are presented for both synthetic and real incident data. Results on real incident data from Oslo University Hospital (OUH) in Norway suggest that our PPO model outperforms heuristic policies such as dispatching the closest ambulance by Haversine or Euclidean distance. We hope that this work inspires future research on RL for ambulance dispatch and ultimately leads to improved decision-support tools for EMS in Norway and elsewhere.
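The baseline policy the abstract compares against can be sketched in a few lines: dispatch the available ambulance with the smallest great-circle (Haversine) distance to the incident. This is an illustrative sketch, not the paper's implementation; the ambulance/incident record layout (keys `lat`, `lon`, `available`) is a hypothetical choice for the example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def dispatch_closest(ambulances, incident):
    """Baseline heuristic: send the available ambulance nearest to the
    incident by Haversine distance; return None if none is free
    (the incident then waits in the queue)."""
    idle = [a for a in ambulances if a["available"]]
    if not idle:
        return None
    return min(idle, key=lambda a: haversine_km(
        a["lat"], a["lon"], incident["lat"], incident["lon"]))
```

Replacing `haversine_km` with plain Euclidean distance on the coordinates gives the second heuristic mentioned in the TL;DR; neither accounts for the road network, which is one motivation for the OSM-based travel-time estimation and the learned PPO policy.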
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5316