MaskCO: Masked Generation Drives Effective Representation Learning and Exploiting for Combinatorial Optimization

ICLR 2026 Conference Submission 15615 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Neural Combinatorial Optimization, Masked Generation
Abstract: Neural combinatorial optimization (NCO) has long been anchored in paradigms like solution construction or improvement that treat the solution as a monolithic reference, squandering the rich local decision patterns embedded in high-quality solutions. Inspired by self-supervised pretraining breakthroughs in language and vision, where simple yet powerful paradigms like next-token prediction enable scalable learning, we ask: \textit{Can combinatorial optimization adopt such a fundamental training paradigm to enable effective and scalable representation learning?} We introduce MaskCO, a masked generation approach that reframes learning to optimize as solution-level self-supervised learning on given reference solutions. By strategically masking portions of optimal solutions and training models to recover the missing content, MaskCO turns a single (instance, solution) pair into hundreds of (instance, partial solution) pairs, encouraging the model to internalize fine-grained, localized decision patterns. For inference, we propose a mask-and-reconstruct procedure that naturally leverages the training objective to implement a local-search-like refinement: each iteration masks certain variables and reconstructs them through masked generation, progressively improving the current solution. We also find that the learned representations readily transfer to alternative inference routines and facilitate effective fine-tuning of other models. Experimental results demonstrate that masked generation serves as a universal learning objective across multiple CO problems, redefining how solutions are learned, refined, and scaled. Compared to previous state-of-the-art neural solvers, MaskCO reduces the optimality gap by more than 99\% and achieves a 10x speedup on the Travelling Salesman Problem (TSP).
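The mask-and-reconstruct inference loop described in the abstract can be sketched for TSP. This is an illustrative toy, not the paper's implementation: the trained masked-generation model is replaced by a simple greedy reordering of the masked cities (`reconstruct` below is a hypothetical stand-in), and the accept-if-improved loop mimics the local-search-like refinement.

```python
import math
import random

def tour_length(tour, dist):
    """Total length of a closed tour under distance matrix `dist`."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def reconstruct(start, masked, dist):
    """Stand-in for the learned masked-generation model: greedily order the
    masked cities starting from the fixed city `start`. MaskCO would instead
    decode this span with the trained network."""
    order, remaining, cur = [], set(masked), start
    while remaining:
        nxt = min(remaining, key=lambda c: dist[cur][c])
        order.append(nxt)
        remaining.remove(nxt)
        cur = nxt
    return order

def mask_and_reconstruct(tour, dist, iters=200, mask_len=3, seed=0):
    """Iteratively mask a contiguous span of the tour and rebuild it,
    keeping the candidate whenever it shortens the tour."""
    rng = random.Random(seed)
    best, best_len = list(tour), tour_length(tour, dist)
    n = len(best)
    for _ in range(iters):
        i = rng.randrange(n)
        rot = best[i:] + best[:i]        # rotate so the mask sits at rot[1:1+mask_len]
        masked = rot[1:1 + mask_len]
        cand = [rot[0]] + reconstruct(rot[0], masked, dist) + rot[1 + mask_len:]
        cand_len = tour_length(cand, dist)
        if cand_len < best_len:          # accept only improving reconstructions
            best, best_len = cand, cand_len
    return best, best_len
```

On a small random Euclidean instance, repeated masking and reconstruction monotonically shortens the tour, mirroring the iterative refinement the abstract describes.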
Primary Area: optimization
Submission Number: 15615