Automated Optimization Modeling via a Localizable Error-Driven Perspective

ICLR 2026 Conference Submission 62 Authors

01 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: LLM post-training, automated optimization modeling
Abstract: Automated optimization modeling via Large Language Models (LLMs) has emerged as a promising approach to assist complex human decision-making. While post-training has become a pivotal technique for enhancing LLMs' capabilities in this domain, its effectiveness is severely constrained by the scarcity and underutilization of high-quality training data. Through a detailed profiling of error patterns across problem-response pairs drawn from post-training, we identify two fundamental limitations of existing automated optimization modeling approaches: (L1) the \textit{sparsity} of error-specific problems and (L2) the \textit{sparse rewards} associated with difficult problems. We demonstrate that these limitations can result in suboptimal performance in domain-specific post-training for LLMs. To address them, we propose a novel error-driven learning framework, auto\textbf{m}ated opt\textbf{i}mization modeli\textbf{n}g via a localizable error-\textbf{d}riven perspective (MIND), that customizes the entire training pipeline from data synthesis to post-training. MIND is based on our key observation of the unique \textbf{\textit{localizable}} patterns of error propagation in optimization modeling: modeling errors may remain localized to specific semantic segments rather than propagating throughout the entire solution. Thus, in contrast to approaches for holistic reasoning tasks such as mathematical proofs, MIND constructs a focused, high-density training corpus and introduces \textbf{D}ynamic Supervised \textbf{F}ine-Tuning \textbf{P}olicy \textbf{O}ptimization (DFPO) to tackle difficult problems through localized refinement. MIND (1) generates targeted, error-aware training problems that achieve superior sample efficiency, and (2) ensures a coherent and structured learning progression for stable and effective reinforcement learning on difficult problems.
Experiments on six benchmarks demonstrate that MIND \textit{consistently} outperforms all the state-of-the-art automated optimization modeling approaches. Furthermore, we open-source a new training dataset, MIND-Train, and a new benchmark, MIND-Bench, for the automated optimization modeling research community.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 62