Keywords: Supply Chain Optimization, Decentralized Control, Partially Observable Markov Decision Processes (POMDP), Inventory Management, Multi-Agent Systems.
Abstract: We consider a multi-retailer supply chain where each retailer can dynamically choose when to share information (e.g., local inventory levels or demand observations) with other retailers, incurring a communication cost for each sharing event. This flexible information exchange mechanism contrasts with fixed protocols such as always sharing or never sharing. We formulate a joint optimization of inventory control and communication strategies, aiming to balance communication overhead against operational performance (service levels, holding costs, and stockout costs). We adopt a common information framework and derive a centralized Partially Observable Markov Decision Process (POMDP) model for a supply chain coordinator.
Solving this coordinator’s POMDP via dynamic programming characterizes the structure of optimal policies, determining when retailers should communicate and how they should adjust orders based on available information. We show that, in this setting, retailers can often act optimally by sharing only limited summaries of their private data, reducing communication frequency without compromising performance. We also incorporate practical constraints on communication frequency and propose an approximate point-based POMDP solution method (PBVI/SARSOP) to address computational complexity. Numerical experiments on multi-retailer inventory scenarios demonstrate that our approach significantly improves the cost–service trade-off compared to static information sharing policies, effectively optimizing the schedule of information exchange for cooperative inventory control.
Submission Number: 1