Effective, Explainable, and Robust Policy Design in an Economy-SIR Model using Two-Level Reinforcement Learning

TMLR Paper59 Authors

21 Apr 2022 (modified: 28 Feb 2023), Rejected by TMLR
Abstract: Optimizing economic policies in the face of severe economic shocks, such as pandemics, poses a complex multi-agent challenge. Here we show that the AI Economist framework can learn sets of effective, robust, and explainable policies using two-level reinforcement learning (RL) and data-driven simulations. Our framework can optimize for a wide range of (mis)aligned social welfare objectives and accounts for strategic behavioral responses. In contrast, existing analytical and computational methods do not scale to this setting. We validate our framework by optimizing state policies and federal subsidies in an unemployment-vaccination-SIR simulation fitted to US COVID-19 data. We find that log-linear RL policies significantly improve public health and economic metrics compared to real-world outcomes. In particular, federal subsidies can better align the incentives of state and federal agents. The learned policies are explainable, e.g., they respond strongly to changes in recovery and vaccination rates. They are also robust to calibration errors, e.g., infection rates that are over- or underestimated. To date, real-world policymaking has not widely adopted machine learning methods, including RL and AI-driven simulations. Our results show the potential of AI to guide policy design and improve social welfare amidst the complexity of the real world.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Based on the feedback from the reviewers, we've made a preliminary revision:
- Adjusted title to remove "explainable" and add "log-linear".
- Added modeling assumptions before model description in section 4.
- Added SIR and economics equations to section 4 (taken from section 9).
- Added discussion on causal inference.
- Added discussion on difficulty of reducing to and solving large general-sum Stackelberg games.
- Added discussion on results with simple baselines ("always open/closed").
- Added discussion on how subsidies couple the objectives of the US states, making the setup a collection of N weakly coupled mechanism design problems.
- Other small edits in response to the reviews.
- Added more baselines: always-open, always-closed (with/without subsidies), and intermediate version (close for 3, 6, 9, 12 months).
- Added more related work on policy analysis and design.
- Emphasized that all simple/heuristic baselines underperform the real-world policies (and thus always underperform AI policies).
Assigned Action Editor: ~Michael_Bowling1
Submission Number: 59