Prescribe-then-Select: Adaptive Policy Selection for Contextual Stochastic Optimization

TMLR Paper5821 Authors

05 Sept 2025 (modified: 04 Dec 2025) · Under review for TMLR · CC BY 4.0
Abstract: We address the problem of policy selection in contextual stochastic optimization (CSO), where covariates are available as contextual information and decisions must satisfy hard feasibility constraints. In many CSO settings, multiple candidate policies arising from different modeling paradigms exhibit heterogeneous performance across the covariate space, with no single policy uniformly dominating. We propose Prescribe-then-Select (PS), a modular framework that first constructs a library of feasible candidate policies and then learns a meta-policy to select the best policy for the observed covariates. We implement the meta-policy using ensembles of Optimal Policy Trees trained via cross-validation on the training set, making policy choice entirely data-driven. Across two benchmark CSO problems (single-stage newsvendor and two-stage shipment planning), PS consistently outperforms the best single policy in heterogeneous regimes of the covariate space and converges to the dominant policy when such heterogeneity is absent. The code to reproduce all results is available at https://anonymous.4open.science/r/Prescribe-then-Select-TMLR.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Minor changes addressing new reviewer comment.
Assigned Action Editor: ~Reza_Babanezhad_Harikandeh1
Submission Number: 5821