Keywords: Learning to Optimize, Machine Learning for Combinatorial Optimization, Mixed-Integer Linear Programming
TL;DR: We propose a hierarchical preference optimization framework for MILP solving.
Abstract: Mixed-integer linear programming (MILP) is a fundamental yet computationally challenging optimization problem in operations research.
To accelerate the solving process, recent machine learning methods predict an initial solution and confine the subsequent search to a local trust region.
However, these models face two critical challenges during training.
First, the models are typically trained on a collection of high-quality solutions weighted by their objective values, which fails to account for a solution's distance to the near-optimal region and leads to a biased training signal.
Second, weighting by objective value provides an ambiguous preference signal, which prevents the model from learning to explicitly distinguish between high-quality and locally optimal solutions.
To address these challenges, we introduce HiPO-MILP, a novel Hierarchical Preference Optimization framework.
Our key idea is to define a quality score for each solution that combines its objective value with its distance to the convex hull of optimal solutions.
Based on this score, HiPO-MILP constructs a three-tiered preference hierarchy that distinguishes between near-optimal, high-quality, and perturbed solutions, thereby providing a clear and robust learning signal.
By training with explicit preference pairs derived from this hierarchy, HiPO-MILP learns to navigate the solution space towards regions that are not only high-scoring but also structurally closer to the global optimum.
Experiments demonstrate that HiPO-MILP substantially improves solving efficiency across a diverse range of MILP benchmarks.
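The scoring and pairing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual score formula, tier thresholds, or perturbation scheme, so everything below is an assumption made for illustration: the function names (`quality_score`, `assign_tier`, `perturb`, `build_preference_pairs`), the mixing weight `alpha`, the threshold `near_thresh`, and the use of the Hamming distance to the nearest known optimal solution as a crude stand-in for the distance to the convex hull of optimal solutions. Binary variables and a minimization objective are also assumed.

```python
from itertools import product

def quality_score(x, obj, optimal_solutions, best_obj, alpha=0.5):
    """Hypothetical quality score (lower is better, minimization assumed)."""
    # Relative gap between this solution's objective and the best known one.
    gap = abs(obj - best_obj) / (abs(best_obj) + 1e-9)
    # Assumed proxy for distance to the convex hull of optimal solutions:
    # normalized Hamming distance to the nearest known optimal solution.
    dist = min(sum(a != b for a, b in zip(x, s)) for s in optimal_solutions) / len(x)
    return alpha * gap + (1 - alpha) * dist

def assign_tier(score, near_thresh=0.05):
    """Tier 0 = near-optimal, tier 1 = high-quality (threshold is assumed)."""
    return 0 if score <= near_thresh else 1

def perturb(x, flip_indices):
    """Build a tier-2 (perturbed) solution by flipping the given binary entries."""
    return tuple(1 - v if i in flip_indices else v for i, v in enumerate(x))

def build_preference_pairs(tiered):
    """Return (preferred, rejected) pairs; a lower tier is always preferred."""
    return [(a, b) for (a, ta), (b, tb) in product(tiered, repeat=2) if ta < tb]

# Toy data: one known optimal solution and two collected solutions.
optimal = [(1, 0, 1, 0)]
best_obj = 10.0
collected = [((1, 0, 1, 0), 10.0), ((1, 1, 1, 0), 11.0)]

tiered = [
    (x, assign_tier(quality_score(x, obj, optimal, best_obj)))
    for x, obj in collected
]
# Perturbed solutions always occupy the lowest tier in this sketch.
tiered.append((perturb((1, 1, 1, 0), {0, 3}), 2))

pairs = build_preference_pairs(tiered)
```

Each resulting pair could then feed a pairwise preference loss during training. The choice of loss, and how the real framework measures distance to the convex hull, is left to the paper itself.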
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 13140