Learning Best-in-Class Policies for the Predict-then-Optimize Framework

Michael Huang; Vishal Gupta

Published: 01 Jan 2024, Last Modified: 02 Oct 2024. CoRR 2024. CC BY-SA 4.0

Abstract: We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. The key idea is to connect the expected downstream decision loss with the directional derivative of a particular plug-in objective, and then approximate this derivative using zeroth-order gradient techniques. Unlike the original decision loss, which is typically piecewise constant and discontinuous, our new PG losses can be optimized using off-the-shelf gradient-based methods. Most importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. Hence, optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result in misspecified settings, and we provide numerical evidence confirming that our PG losses substantively outperform existing proposals when the underlying model is misspecified.
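
The key idea in the abstract can be illustrated with a small sketch. It is not taken from the paper and rests on several assumptions: the downstream problem is a linear minimization over a finite set of candidate decisions (`vertices`), the plug-in objective is V(t) = min_z t·z for a predicted cost vector t, and the derivative of V at t in the direction of the realized cost y is approximated by a backward finite difference with step `h`. All function names and the step size are hypothetical, and the exact form of the PG losses in the paper may differ.

```python
import numpy as np

def plug_in_value(t, vertices):
    # Assumed plug-in objective: V(t) = min over candidate decisions z of t . z
    return float(np.min(vertices @ t))

def decision_loss(t, y, vertices):
    # Realized cost y of the decision that is optimal for the prediction t.
    # Piecewise constant in t: it jumps whenever the argmin changes.
    z_hat = vertices[np.argmin(vertices @ t)]
    return float(y @ z_hat)

def pg_loss_backward(t, y, vertices, h):
    # Backward finite-difference (zeroth-order) estimate of the directional
    # derivative of V at t in direction y. Continuous in t for any fixed h > 0.
    return (plug_in_value(t, vertices) - plug_in_value(t - h * y, vertices)) / h

# Toy instance: pick exactly one of three items (hypothetical data).
vertices = np.eye(3)
t = np.array([1.0, 2.0, 3.0])   # predicted costs
y = np.array([5.0, 1.0, 2.0])   # realized costs

print(decision_loss(t, y, vertices))
print(pg_loss_backward(t, y, vertices, h=0.01))
```

In this toy instance the minimizer under t is unique and the step is small enough that the perturbation does not change it, so the two quantities agree up to floating-point error. When the prediction sits near a tie between decisions, the decision loss jumps, while the finite-difference surrogate varies continuously with t, which is what makes gradient-based training possible.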
