DiMa: Understanding the Hardness of Online Matching Problems via Diffusion Models

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: We explore the potential of \emph{AI-enhanced combinatorial optimization theory}, taking online bipartite matching (OBM) as a case study. In the theoretical study of OBM, \emph{hardness} corresponds to a performance \emph{upper bound} on a specific online algorithm or on all possible online algorithms. Typically, these upper bounds derive from challenging instances meticulously designed by theoretical computer scientists. Zhang et al. (ICML 2024) recently provided an example demonstrating how reinforcement learning techniques can improve the hardness result for a specific OBM model. Their attempt is inspiring but preliminary, and it is unclear whether their methods can be extended to other OBM problems to achieve similar breakthroughs. This paper takes a further step by introducing DiMa, a novel unified framework for understanding the hardness of OBM problems, based on denoising diffusion probabilistic models (DDPMs). DiMa models the generation of hard instances as a sequence of denoising steps and optimizes them with a novel reinforcement learning algorithm, named \emph{shortcut policy gradient} (SPG). We first validate DiMa on the classic OBM problem by reproducing its hardest known input instance in the literature. We then apply DiMa to two well-known variants of OBM, for which the exact hardness remains an open problem, and successfully improve their state-of-the-art theoretical upper bounds.
Lay Summary: In this paper, we explore the potential of AI-enhanced combinatorial optimization theory, taking online bipartite matching (OBM) as a case study. OBM is a fundamental problem in theoretical computer science (TCS), whose goal is to find a maximum matching on a gradually revealed bipartite graph instance, such as matching as many drivers to riders as possible in real time. In the theoretical study of OBM, hardness corresponds to performance limits of algorithms. Typically, these bottlenecks are derived from challenging instances meticulously designed by theoretical computer scientists. We ask whether AI could assist in reproducing (or even improving) these bottlenecks by automatically generating novel, harder instances with minimal human expertise. We introduce DiMa, a diffusion-based framework that models the generation of hard instances as denoising steps and optimizes them by reinforcement learning. DiMa successfully reproduces the known hardness of the classic OBM problem, and further improves the state of the art for two well-known variants of OBM. We believe DiMa has great potential in AI-assisted TCS and may inspire interesting future work in other fields of TCS, such as approximation algorithms or algorithmic game theory.
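To make the idea concrete, the sketch below is a toy illustration (not the authors' DiMa/SPG method): an instance generator that refines random noise over a few "denoising" steps with learned per-step drifts, thresholds the result into a bipartite adjacency matrix, and uses a REINFORCE-style policy gradient to push the generator toward instances on which a simple greedy online algorithm does poorly relative to the offline optimum. All function names, the hardness proxy, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy_competitive_ratio(adj):
    """Hardness proxy: greedy online matching size / offline optimum size.
    adj[u, v] = 1 iff offline vertex u is adjacent to online vertex v."""
    n = adj.shape[0]
    matched, online = set(), 0
    for v in range(n):              # online vertices arrive one by one
        for u in range(n):          # greedy: match to the first free neighbour
            if adj[u, v] and u not in matched:
                matched.add(u)
                online += 1
                break
    # offline optimum via augmenting paths (Kuhn's algorithm)
    match_u = [-1] * n
    def try_aug(v, seen):
        for u in range(n):
            if adj[u, v] and u not in seen:
                seen.add(u)
                if match_u[u] == -1 or try_aug(match_u[u], seen):
                    match_u[u] = v
                    return True
        return False
    offline = sum(try_aug(v, set()) for v in range(n))
    return online / offline if offline else 1.0

def train(n=6, T=4, steps=100, lr=0.1, noise=0.3):
    """Learn per-step drifts theta[t] so the denoised instances become harder.
    Toy stand-in for diffusion-based instance generation with policy gradients."""
    theta = [np.zeros((n, n)) for _ in range(T)]
    baseline = 0.0
    for _ in range(steps):
        x, epss = rng.normal(size=(n, n)), []
        for t in range(T):          # "denoising" trajectory: drift + Gaussian noise
            eps = rng.normal(size=(n, n))
            x = x + theta[t] + noise * eps
            epss.append(eps)
        adj = (x > 0).astype(int)   # threshold into a 0/1 bipartite instance
        reward = 1.0 - greedy_competitive_ratio(adj)  # harder => higher reward
        baseline = 0.9 * baseline + 0.1 * reward      # variance-reducing baseline
        for t in range(T):          # REINFORCE: score of N(mu, sigma^2) wrt mu is eps/sigma
            theta[t] += lr * (reward - baseline) * epss[t] / noise
    return theta
```

As a sanity check, on the classic adversarial two-vertex instance (the first online vertex connects to both offline vertices, the second only to the first), greedy achieves ratio 1/2 while the offline optimum matches both, which is exactly the kind of gap the reward rewards.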
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Applications->Everything Else
Keywords: Online Matching, Diffusion Model, Reinforcement Learning
Submission Number: 6427