Learning to Stop: Deep Learning for Mean Field Optimal Stopping

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Optimal stopping is a fundamental problem in optimization with applications in risk management, finance, robotics, and machine learning. We extend the standard framework to a multi-agent setting, named multi-agent optimal stopping (MAOS), where agents cooperate to make optimal stopping decisions in a finite-space, discrete-time environment. Since solving MAOS becomes computationally prohibitive as the number of agents grows large, we study the mean-field optimal stopping (MFOS) problem, obtained in the limit as the number of agents tends to infinity. We establish that MFOS provides a good approximation to MAOS and prove a dynamic programming principle (DPP) based on mean-field control theory. We then propose two deep learning approaches: one that learns optimal stopping decisions by simulating full trajectories, and another that leverages the DPP to compute the value function and learn the optimal stopping rule by backward induction. Both methods train neural networks to approximate optimal stopping policies. We demonstrate the effectiveness and scalability of both methods through numerical experiments on 6 different problems in spatial dimension up to 300. To the best of our knowledge, this is the first work to formalize and computationally solve MFOS in discrete time and finite space, opening new directions for scalable MAOS methods.
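The dynamic programming principle the abstract refers to can be illustrated in a stripped-down, single-agent form. The sketch below is not the paper's method (it has no mean-field interaction and no neural networks; the transition matrix, reward, and horizon are all invented for illustration): it just shows the backward-induction structure in which the value at each time is the maximum of the immediate stopping reward and the expected continuation value.

```python
import numpy as np

def solve_optimal_stopping(P, reward, T):
    """Backward induction (DPP) for a toy finite-state, discrete-time
    optimal stopping problem:
        V_T(x) = g_T(x),
        V_t(x) = max( g_t(x), E[ V_{t+1}(X_{t+1}) | X_t = x ] ).
    P: (n, n) row-stochastic transition matrix.
    reward: callable (t, x) -> float, the reward for stopping at (t, x).
    Returns V of shape (T+1, n) and a boolean stop-region mask."""
    n = P.shape[0]
    V = np.zeros((T + 1, n))
    stop = np.zeros((T + 1, n), dtype=bool)
    for x in range(n):
        V[T, x] = reward(T, x)
    stop[T, :] = True  # forced to stop at the horizon
    for t in range(T - 1, -1, -1):
        cont = P @ V[t + 1]  # continuation value E[V_{t+1} | X_t = x]
        for x in range(n):
            g = reward(t, x)
            stop[t, x] = g >= cont[x]
            V[t, x] = max(g, cont[x])
    return V, stop

# Illustrative example: reflected random walk on {0,...,4},
# stopping reward x - 0.1*t (waiting is penalized).
n, T = 5, 10
P = np.zeros((n, n))
for x in range(n):
    P[x, max(x - 1, 0)] += 0.5
    P[x, min(x + 1, n - 1)] += 0.5
V, stop = solve_optimal_stopping(P, lambda t, x: x - 0.1 * t, T)
```

In the paper's setting the state is augmented by the population distribution, and the two proposed deep learning methods replace this exhaustive table either by simulating full trajectories or by approximating each backward-induction step with a neural network.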
Lay Summary: Imagine you’re playing a game where you have to decide the perfect moment to stop — whether it's selling a stock, ending a robot’s task, or choosing when to share data. This “when to stop” question is called an optimal stopping problem, and it becomes much more challenging when multiple players (or agents) must make decisions together. Our research addresses this challenge by examining how large groups of decision-makers can collaborate to determine the optimal stopping times. But solving this exactly for many agents quickly becomes too complex. So, we look at what happens when the number of agents becomes very large — and use that insight to simplify the problem. We built two deep learning tools that help these agents learn when to stop by training neural networks. One simulates entire scenarios, and the other breaks the problem down using a mathematical shortcut called dynamic programming. This work opens the door to solving complex, multi-agent problems in fields like finance, robotics, and machine learning — even when the systems are huge and high-dimensional.
Primary Area: Deep Learning->Algorithms
Keywords: Optimal Stopping, Deep Learning, Mean Field Games
Submission Number: 8372