Learning to Optimize at Scale: A Benders Decomposition-TransfORmers Framework for Stochastic Combinatorial Optimization

Published: 28 Nov 2025, Last Modified: 30 Nov 2025 · NeurIPS 2025 Workshop MLxOR · CC BY 4.0
Keywords: Stochastic Mixed-Integer Programming, Dynamic Decomposition, Transformer Models, Capacitated Lot-Sizing, Learning-Optimization Integration
TL;DR: We develop a decomposition-transformer framework that learns to optimize large-scale stochastic combinatorial problems, achieving faster and more scalable solutions than classical methods.
Abstract: We introduce the first integration of transformer-based learning within a dynamic decomposition framework to solve large-scale two-stage stochastic mixed-integer programs. We focus on the stochastic capacitated lot-sizing problem under demand uncertainty. Our approach reformulates the problem to incorporate stochasticity and partitions it into smaller, tractable subproblems via a dynamic decomposition. These subproblems are solved by a transformer model, which supplies a novel combination of learned combinatorial and relaxation-based cuts to the master problem. This hybrid learning-optimization strategy bridges deep learning and exact mathematical programming, yielding fast yet high-quality solutions. On the test set considered, our method outperforms traditional decomposition and direct solvers in both runtime and scalability, demonstrating the potential of transformers as surrogate optimizers embedded within structured solution algorithms for stochastic combinatorial problems.
Submission Number: 153
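To make the decomposition structure concrete, the following is a minimal, hypothetical sketch of a classical Benders loop on a toy two-stage capacitated lot-sizing instance with a handful of demand scenarios. It is not the paper's method: the scenario subproblems are solved exactly with an LP solver and the optimality cuts are derived from their duals, whereas in the paper's framework a transformer surrogate would solve the subproblems and propose the cuts. All data, names, and tolerances (e.g., `solve_subproblem`, `master_by_enumeration`) are illustrative assumptions.

```python
# Hypothetical sketch: single-cut Benders on a toy two-stage stochastic
# lot-sizing instance. Exact LP subproblems stand in for the paper's
# transformer surrogate; all data below is made up for illustration.
import numpy as np
from scipy.optimize import linprog

T = 2                     # planning periods (binary first-stage setups)
C = 10.0                  # capacity per period if a setup is made
setup_cost = np.array([4.0, 4.0])
prod_cost, short_pen = 1.0, 20.0
scenarios = [np.array([3.0, 5.0]), np.array([8.0, 2.0]), np.array([6.0, 9.0])]
probs = np.array([1/3, 1/3, 1/3])

def solve_subproblem(y, d):
    """Exact recourse LP for one scenario; returns cost and capacity duals.
    In the learned framework, this solve would be approximated by a model
    that predicts the cut coefficients instead."""
    # variables z = [x1, x2, u1, u2]: production and unmet-demand slack
    c = np.array([prod_cost, prod_cost, short_pen, short_pen])
    A_ub = np.array([
        [ 1,  0,  0,  0],   # x1 <= C*y1
        [ 0,  1,  0,  0],   # x2 <= C*y2
        [-1,  0, -1,  0],   # cover demand of period 1
        [-1, -1, -1, -1],   # cover cumulative demand of periods 1-2
    ])
    b_ub = np.array([C * y[0], C * y[1], -d[0], -(d[0] + d[1])])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
    cap_duals = res.ineqlin.marginals[:T]   # d(obj)/d(rhs) of capacity rows
    return res.fun, cap_duals

def master_by_enumeration(cuts):
    """Tiny master problem: enumerate binary setups and apply all cuts."""
    best = None
    for y0 in (0, 1):
        for y1 in (0, 1):
            y = np.array([y0, y1], dtype=float)
            theta = max((q + g @ (y - ybar) for q, g, ybar in cuts), default=0.0)
            obj = setup_cost @ y + theta
            if best is None or obj < best[0]:
                best = (obj, y, theta)
    return best

cuts, ub = [], np.inf
for it in range(20):
    lb, y, theta = master_by_enumeration(cuts)
    # expected recourse cost and an aggregated optimality cut at the current y
    exp_q, grad = 0.0, np.zeros(T)
    for p, d in zip(probs, scenarios):
        q, duals = solve_subproblem(y, d)
        exp_q += p * q
        grad += p * duals * C            # rhs of capacity row is C*y_t
    ub = min(ub, setup_cost @ y + exp_q)
    if exp_q <= theta + 1e-6:            # master estimate matches recourse: stop
        break
    cuts.append((exp_q, grad, y.copy()))

print(f"setups={y}, cost={ub:.2f}, iterations={it + 1}")
```

In the framework described above, the two expensive pieces of this loop, solving the scenario subproblems and extracting cut coefficients, are the natural places to substitute a learned model; the master problem and the cut-aggregation logic stay exact.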