Keywords: Mixed-Integer Linear Programming, Combinatorial Optimization, MILP Instance Generation
Abstract: Efficient and controllable data generation is critical for improving the performance of data-driven Mixed-Integer Linear Programming (MILP) solvers, especially in applications facing data scarcity. However, existing MILP instance generation methods typically require training a separate model for each problem class, which can be computationally intensive and does not allow for the generation of instances with varying sizes and solution difficulties. To address these challenges, we introduce MILP-Retrieval, a framework for targeted MILP instance generation via formulation code retrieval. We first build a diverse MILP library that includes multiple modalities and use it to pretrain an MILP embedding model. Based on the output of this embedding model, we propose a novel similarity metric that accurately measures the similarity between instances of different sizes within the same problem class. MILP-Retrieval leverages this new metric to retrieve the formulation code of a target instance and further tune it. Experimental results demonstrate the effectiveness of generating MILP instances through formulation code retrieval, with the ability to control both the scale and difficulty of the generated instances. This approach provides a novel perspective on MILP instance generation and opens up new possibilities for learning-based solvers.
Primary Area: optimization
Submission Number: 12955