HiPO-MILP: Hierarchical Preference Optimization for MILP Solving

ICLR 2026 Conference Submission13140 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Learning to Optimize, Machine Learning for Combinatorial Optimization, Mixed-Integer Linear Programming
TL;DR: We propose a hierarchical preference optimization framework for MILP solving.
Abstract: Mixed-integer linear programming (MILP) is a fundamental yet computationally challenging optimization problem in operations research. To accelerate the solving process, recent machine learning methods predict an initial solution and confine the subsequent search to a local trust region around it. However, these models face two critical challenges during training. First, they are typically trained on a collection of high-quality solutions weighted by their objective values, which fails to account for a solution's distance to the near-optimal region and leads to a biased training signal. Second, weighting by objective value provides an ambiguous preference signal, which prevents the model from learning to explicitly distinguish between high-quality and locally optimal solutions. To address these challenges, we introduce HiPO-MILP, a novel Hierarchical Preference Optimization framework. Our key idea is to define a quality score for each solution that combines its objective value with its distance to the convex hull of optimal solutions. Based on this score, HiPO-MILP constructs a three-tiered preference hierarchy that distinguishes between near-optimal, high-quality, and perturbed solutions, thereby providing a clear and robust learning signal. By training with explicit preference pairs derived from this hierarchy, HiPO-MILP learns to navigate the solution space towards regions that are not only high-scoring but also structurally closer to the global optimum. Experiments demonstrate that HiPO-MILP substantially improves solving efficiency across a diverse range of MILP benchmarks.
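The scoring and tiering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the function names, the weight `lam`, the thresholds, the minimization convention, and in particular the use of the normalized L1 distance to the nearest known optimal solution as a crude stand-in for the distance to the convex hull of optimal solutions.

```python
def quality_score(obj, best_obj, x, optimal_pool, lam=1.0):
    """Hypothetical quality score (lower is better) for a minimization MILP.

    Combines the relative objective gap with a structural distance term.
    Assumption: the distance to the convex hull of optimal solutions is
    approximated by the normalized L1 distance to the nearest pool member.
    """
    gap = abs(obj - best_obj) / (abs(best_obj) + 1e-9)
    dist = min(
        sum(abs(a - b) for a, b in zip(x, opt)) for opt in optimal_pool
    ) / len(x)
    return gap + lam * dist


def assign_tiers(scores, near_thresh, high_thresh):
    """Map scores to tiers: 0 = near-optimal, 1 = high-quality, 2 = perturbed.

    Assumption: tiers come from thresholding the score; the thresholds are
    illustrative and not taken from the paper.
    """
    tiers = []
    for s in scores:
        if s <= near_thresh:
            tiers.append(0)
        elif s <= high_thresh:
            tiers.append(1)
        else:
            tiers.append(2)
    return tiers


def preference_pairs(tiers):
    """Return (preferred, rejected) index pairs across different tiers."""
    n = len(tiers)
    return [(i, j) for i in range(n) for j in range(n) if tiers[i] < tiers[j]]


# Toy example with one known optimal solution and three candidates.
pool = [(1, 0, 1, 0)]
candidates = [((1, 0, 1, 0), 10.0), ((1, 0, 1, 1), 10.5), ((0, 1, 0, 1), 12.0)]
scores = [quality_score(obj, 10.0, x, pool) for x, obj in candidates]
tiers = assign_tiers(scores, near_thresh=0.1, high_thresh=0.5)
pairs = preference_pairs(tiers)
```

In this toy setting, a candidate with a small objective gap but a large structural distance is pushed into a lower tier, which is the behavior that weighting by objective value alone would miss. The resulting pairs could feed any pairwise preference loss; which loss HiPO-MILP uses is not stated in the abstract.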
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 13140