Joint Pricing and Resource Allocation: An Optimal Online-Learning Approach

Published: 28 Nov 2025, Last Modified: 30 Nov 2025NeurIPS 2025 Workshop MLxOREveryoneRevisionsBibTeXCC BY 4.0
Keywords: Online learning, Dynamic Pricing, Resource Allocation
TL;DR: We study a two-stage pricing-and-allocation joint decision problem, presenting an online-learning algorithm that achieves $\tilde{O}(\sqrt{T})$ optimal regret.
Abstract: We study an online learning problem on dynamic pricing and resource allocation, where we make joint pricing and inventory decisions to maximize the overall net profit. We consider the stochastic dependence of demands on the price, which complicates the resource allocation process and introduces significant non-convexity and non-smoothness to the problem. To solve this problem, we develop an efficient algorithm that utilizes a "Lower-Confidence Bound (LCB)" meta-strategy over multiple OCO agents. Our algorithm achieves $\tilde{O}(\sqrt{Tmn})$ regret (for $m$ suppliers and $n$ consumers), which is *optimal* with respect to the time horizon $T$. Our results illustrate an effective integration of statistical learning methodologies with complex operations research problems.
Submission Number: 47
Loading