Wait-Less Offline Tuning and Re-solving for Online Decision Making

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY-NC-SA 4.0
Abstract: Online linear programming (OLP) has found broad applications in revenue management and resource allocation. State-of-the-art OLP algorithms achieve low regret by repeatedly solving linear programming (LP) subproblems that incorporate updated resource information. However, LP-based methods are computationally expensive and often inefficient for large-scale applications. By contrast, recent first-order OLP algorithms are more computationally efficient but typically suffer from weaker regret guarantees. To address these shortcomings, we propose a new algorithm that combines the strengths of LP-based and first-order OLP algorithms. Our algorithm re-solves the LP subproblems periodically at a predefined frequency $f$ and uses the latest dual prices to guide online decision-making. In parallel, a first-order method runs during each interval between LP re-solves and smooths resource consumption. Our algorithm achieves $\mathcal{O}(\log (T/f) + \sqrt{f})$ regret and delivers a "wait-less" online decision-making process that balances computational efficiency and regret guarantees. Extensive experiments demonstrate at least $10$-fold improvements in regret over first-order methods and $100$-fold improvements in runtime over LP-based methods.
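The hybrid loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy sizes, step size `eta`, re-solve routine, and the use of projected subgradient ascent as the first-order method are all assumptions for the sketch, and `scipy.optimize.linprog` stands in for whichever LP solver is actually used.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
T, m = 1000, 3                     # horizon and number of resources (toy sizes)
f = 100                            # LP re-solve interval (hypothetical choice)
b = 0.25 * T * np.ones(m)          # resource budgets
r = rng.uniform(0, 1, T)           # reward of each request
A = rng.uniform(0, 1, (T, m))      # resource consumption of each request

def resolve_duals(t, b_rem):
    """Periodic re-solve step (sketch): solve the LP on requests seen so far,
    scaled to the remaining horizon, and return its resource dual prices."""
    n = t + 1
    scale = (T - t) / n
    res = linprog(-r[:n], A_ub=A[:n].T * scale, b_ub=b_rem,
                  bounds=(0, 1), method="highs")
    # HiGHS reports marginals of <= constraints as nonpositive; negate
    # to obtain nonnegative dual prices for the maximization problem.
    return -np.array(res.ineqlin.marginals)

p = np.zeros(m)                    # current dual prices
b_rem = b.copy()                   # remaining resources
reward = 0.0
eta = 1.0 / np.sqrt(T)             # first-order step size (assumed)
d = b / T                          # target per-period consumption

for t in range(T):
    if t % f == 0:
        p = resolve_duals(t, b_rem)           # slow, accurate LP update
    x = 1.0 if r[t] > p @ A[t] else 0.0       # price-based accept/reject
    if x and np.all(A[t] <= b_rem):
        b_rem -= A[t]
        reward += r[t]
    else:
        x = 0.0
    # between re-solves, a first-order dual update smooths consumption
    p = np.maximum(0.0, p + eta * (A[t] * x - d))
```

Setting `f = T` recovers a purely first-order method, while `f = 1` recovers re-solving at every step; intermediate `f` trades LP computation against regret, as in the stated $\mathcal{O}(\log(T/f) + \sqrt{f})$ bound.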
Lay Summary: Many industries today, such as airlines, retail, and energy, must make rapid decisions to allocate limited resources as customer requests arrive one by one. This challenge can be modeled as an Online Linear Programming (OLP) problem. Existing OLP algorithms either make high-quality decisions at the cost of slow computation or provide faster solutions with reduced accuracy. Our research bridges this trade-off by combining both approaches: we periodically use the accurate but slow method to guide decisions, and in between, we apply a faster method to make quick updates. This hybrid design shifts most of the heavy computation offline, enabling real-time decisions with minimal delay. In this way, our algorithm balances decision quality and efficiency. Experiments confirm this: our method achieves over a $10$-fold improvement in decision quality compared to the fast method, and over a $100$-fold reduction in computation time, with almost the same decision quality, compared to the accurate method. This work provides a new methodology that enables real-world systems to balance speed and decision quality effectively.
Link To Code: https://github.com/Jingruo/Wait-Less-Online-Decision-Making
Primary Area: General Machine Learning->Online Learning, Active Learning and Bandits
Keywords: Online Linear Programming, Sequential Decision Making, Resource Allocation
Submission Number: 7701