Keywords: Discrete Optimization, Predict+Optimize, Decision-Focused Learning
Abstract: The Predict+Optimize (P+O) paradigm seeks to train prediction models for unknown parameters in optimization problems, with the goal of yielding good optimization solutions downstream. Prior works have proposed strategies for gradient computation in neural network training when the downstream optimization problem is a linear program (LP). Yet, in the face of mixed-integer linear programs (MIPs), much prior work simply relaxes the MIP into an LP, resulting in sub-optimally trained predictors. The issue is particularly stark in the recent Two-Stage Predict+Optimize framework, where even the MIP constraints can contain uncertainty.
In this work, we propose a (shockingly) simple and fast approach for addressing the MIP-LP gap, and show that it yields essentially the same or greater accuracy gains than a much slower method adapted from prior work. Concretely, for the latter, we adapt the approach of MIPaaL (Ferber et al. (2020)) and introduce cutting planes into the LP relaxation before applying LP-based gradient computation methods. This adaptation is slow and requires non-trivial changes for the new Two-Stage P+O setting, given the constantly changing constraint predictions during training. We instead propose and advocate a far simpler method: replace the relaxed-LP optimum in the LP-based gradient computation with the true MIP optimum, avoiding repeated calls to (slow) cutting-plane MIP solvers.
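To make the oracle swap concrete, here is a minimal, self-contained sketch on a toy 0-1 knapsack: an SPO+-style subgradient rule (one example of an LP-based gradient computation; the paper's exact rule may differ) is evaluated once with the relaxed-LP optimum and once with the true MIP optimum. All function names and the toy instance are illustrative assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def solve_knapsack_mip(values, weights, capacity):
    # True MIP optimum of a tiny 0-1 knapsack, via brute-force enumeration
    # (standing in for a cutting-plane/branch-and-bound MIP solver).
    n = len(values)
    best_x, best_val = np.zeros(n), -np.inf
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits, dtype=float)
        if weights @ x <= capacity and values @ x > best_val:
            best_x, best_val = x, values @ x
    return best_x

def solve_knapsack_lp(values, weights, capacity):
    # LP-relaxation optimum: classic greedy fractional knapsack.
    x, remaining = np.zeros(len(values)), capacity
    for i in sorted(range(len(values)), key=lambda j: -values[j] / weights[j]):
        if values[i] <= 0 or remaining <= 0:
            break  # taking further items cannot improve the LP objective
        take = min(1.0, remaining / weights[i])
        x[i], remaining = take, remaining - take * weights[i]
    return x

def spo_plus_subgradient(c_hat, c_true, solve):
    # SPO+-style subgradient w.r.t. predicted costs c_hat (maximization
    # convention); `solve` is the oracle whose optimum enters the rule.
    w_spo = solve(2 * c_hat - c_true)
    w_true = solve(c_true)
    return 2 * (w_spo - w_true)

weights, capacity = np.array([2.0, 3.0, 4.0]), 4.0
c_true = np.array([3.0, 4.0, 5.0])  # true item values
c_hat = np.array([5.0, 1.0, 2.0])   # (poor) predicted item values

mip_oracle = lambda c: solve_knapsack_mip(c, weights, capacity)  # proposed: true MIP optimum
lp_oracle = lambda c: solve_knapsack_lp(c, weights, capacity)    # prior: relaxed-LP optimum

print("subgradient with true MIP optimum: ", spo_plus_subgradient(c_hat, c_true, mip_oracle))
print("subgradient with relaxed-LP optimum:", spo_plus_subgradient(c_hat, c_true, lp_oracle))
```

On this instance the LP relaxation returns a fractional solution while the MIP oracle returns an integral one, so the two subgradients differ; the proposed method's training signal follows the true integral optimum.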
Experimental results on three benchmarks show that this simple strategy matches or exceeds the accuracy gains of the much slower cutting-plane approach, and that using the two methods in conjunction yields only minor further gains at the expense of vastly increased training time, sometimes by a whole order of magnitude.
Supplementary Material: zip
Primary Area: optimization
Submission Number: 14527