Keywords: Discrete Optimization, Predict+Optimize, Decision-Focused Learning
Abstract: The Predict+Optimize (P+O) paradigm seeks to train prediction models for unknown parameters in optimization problems, with the goal of yielding good optimization solutions downstream. Prior works have proposed strategies for gradient computation in neural network training when the downstream optimization problem is a linear program (LP). Yet, in the face of mixed-integer linear programs (MIPs), much prior work simply relaxes the MIP into an LP, resulting in sub-optimally trained predictors. The issue is particularly stark in the recent Two-Stage Predict+Optimize framework, where even the MIP constraints can contain uncertainty.
In this work, we propose a (shockingly) simple and fast approach for addressing the MIP-LP gap, and show that it yields essentially the same or greater accuracy gains than a much slower method adapted from prior work. Concretely, for the latter, we adapt the approach of MIPaaL (Ferber et al. (2020)) and introduce cutting planes into the LP relaxation before applying LP-based gradient computation methods. This adaptation is slow and requires non-trivial changes for the new Two-Stage P+O setting, given the constantly changing constraint predictions during training. We instead propose and advocate a far simpler method: replace the relaxed-LP optimum in the LP-based gradient computation with the true MIP optimum, avoiding repeated calls to (slow) cutting-plane MIP solvers.
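To make the oracle swap concrete, here is a minimal, self-contained sketch on a toy 0-1 knapsack: an SPO+-style subgradient rule (one example of an LP-based gradient computation; the paper's exact rule may differ) is evaluated once with the relaxed-LP optimum and once with the true MIP optimum. All function names and the toy instance are illustrative assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def solve_knapsack_mip(values, weights, capacity):
    # True MIP optimum of a tiny 0-1 knapsack, via brute-force enumeration
    # (standing in for a cutting-plane/branch-and-bound MIP solver).
    n = len(values)
    best_x, best_val = np.zeros(n), -np.inf
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits, dtype=float)
        if weights @ x <= capacity and values @ x > best_val:
            best_x, best_val = x, values @ x
    return best_x

def solve_knapsack_lp(values, weights, capacity):
    # LP-relaxation optimum: classic greedy fractional knapsack.
    x, remaining = np.zeros(len(values)), capacity
    for i in sorted(range(len(values)), key=lambda j: -values[j] / weights[j]):
        if values[i] <= 0 or remaining <= 0:
            break  # taking further items cannot improve the LP objective
        take = min(1.0, remaining / weights[i])
        x[i], remaining = take, remaining - take * weights[i]
    return x

def spo_plus_subgradient(c_hat, c_true, solve):
    # SPO+-style subgradient w.r.t. predicted costs c_hat (maximization
    # convention); `solve` is the oracle whose optimum enters the rule.
    w_spo = solve(2 * c_hat - c_true)
    w_true = solve(c_true)
    return 2 * (w_spo - w_true)

weights, capacity = np.array([2.0, 3.0, 4.0]), 4.0
c_true = np.array([3.0, 4.0, 5.0])  # true item values
c_hat = np.array([5.0, 1.0, 2.0])   # (poor) predicted item values

mip_oracle = lambda c: solve_knapsack_mip(c, weights, capacity)  # proposed: true MIP optimum
lp_oracle = lambda c: solve_knapsack_lp(c, weights, capacity)    # prior: relaxed-LP optimum

print("subgradient with true MIP optimum: ", spo_plus_subgradient(c_hat, c_true, mip_oracle))
print("subgradient with relaxed-LP optimum:", spo_plus_subgradient(c_hat, c_true, lp_oracle))
```

On this instance the LP relaxation returns a fractional solution while the MIP oracle returns an integral one, so the two subgradients differ; the proposed method's training signal follows the true integral optimum.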
Experimental results on three benchmarks show that this simple strategy matches or exceeds the accuracy gains of the much slower cutting-plane approach, and that using the two methods in conjunction yields only minor further gains at the expense of vastly increased training time, sometimes by a whole order of magnitude.
Supplementary Material: zip
Primary Area: optimization
Submission Number: 14527