Reinforcement Learning for Admission Control in Two-Sided Queueing Systems

Matthew Sheldon; Giuliano Casale

18 Sept 2025 (modified: 28 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0

Keywords: Reinforcement Learning, Queueing Theory, Admission Control, Two-sided Markets

Abstract: Two-sided queues are a useful formalism for modeling two-sided markets, as well as more general systems in which work is conserved. In practical applications, the arrival rates of the different entities are often unknown and may vary with the state. General-purpose reinforcement learning algorithms may struggle at scale because of their dependence on the diameter of the Markov Decision Process (MDP), which in queueing systems often grows exponentially with the size of the state space. To address these issues, we present an algorithm with a diameter-independent regret bound for the problem of admission control in a two-sided queue. Let $S$ be the size of the state space, $N$ the number of types, $T$ the number of steps, and $\kappa$ the ratio between the upper and lower rate bounds. Our algorithm can then be shown to have a regret bound of $\tilde{O}(\kappa^{3} S^{1.5} \sqrt{T}+\kappa^{2.5} S^{1.5} \sqrt{NT})$. In an empirical study, we show that it can significantly outperform general-purpose algorithms.
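
To give a feel for how the stated bound scales, the following is a minimal sketch that evaluates its two terms numerically. It is not the authors' algorithm: the function names are hypothetical, and the constants and logarithmic factors hidden by the $\tilde{O}$ notation are dropped, so only ratios between values are meaningful, not the values themselves.

```python
import math


def regret_bound_terms(kappa: float, S: int, N: int, T: int) -> tuple[float, float]:
    """Return the two terms of the stated bound.

    Assumption: hidden constants and polylog factors are set to 1,
    so the results are useful only for comparing scaling behaviour.
    """
    first = kappa**3 * S**1.5 * math.sqrt(T)          # kappa^3 S^1.5 sqrt(T)
    second = kappa**2.5 * S**1.5 * math.sqrt(N * T)   # kappa^2.5 S^1.5 sqrt(NT)
    return first, second


def regret_bound(kappa: float, S: int, N: int, T: int) -> float:
    """Sum of the two terms (same assumptions as above)."""
    first, second = regret_bound_terms(kappa, S, N, T)
    return first + second


if __name__ == "__main__":
    base = regret_bound(kappa=2.0, S=4, N=3, T=10_000)
    longer = regret_bound(kappa=2.0, S=4, N=3, T=40_000)
    # Both terms grow like sqrt(T), so quadrupling T doubles the bound.
    print(longer / base)
```

Because both terms are proportional to $\sqrt{T}$, the bound divided by $T$ shrinks like $1/\sqrt{T}$. The ratio of the first term to the second is $\sqrt{\kappa/N}$, so under the unit-constant assumption the second term is the larger one whenever $N > \kappa$.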

Supplementary Material: zip

Primary Area: reinforcement learning

Submission Number: 11245
