Keywords: Offline Reinforcement Learning, Infectious Diseases, Multi-agent Reinforcement Learning, Human Mobility
TL;DR: We combine offline multi-agent reinforcement learning, human mobility data, and epidemic modeling to train a population of agents to mitigate disease spread by optimizing their mobility.
Abstract: The COVID-19 pandemic has generated new real-world, data-driven problems such as predicting case surges, managing resource depletion, and modeling the geospatial spread of infection. Although reinforcement learning (RL) has previously been proposed to optimize regional lockdowns, combining mobility tracking data with offline RL allows us to shift decision-making from a top-down perspective (i.e., driven by governments) to a bottom-up perspective (i.e., driven by individuals). Rather than predicting the outcome of the outbreak, we use offline RL, together with epidemic modeling, as a tool to empower collaborative decision-making at the individual level. We ask whether the population of a city can be trained to become more resilient against infectious diseases. To investigate, we deploy a 'city' of 10,000 agents loaded with real visits to Points of Interest (POIs) (e.g., restaurants, gyms, parks) throughout a target metropolitan area during the COVID-19 pandemic (July 2020). Using a standard compartmental disease model, we find that the city of trained agents can reduce disease transmission by 60%. This opens a new direction in using offline RL as a springboard for further research at the intersection of artificial intelligence and disease mitigation.