Evo-Step: Evolutionary Generation and Stepwise Validation for Optimizing LLMs in OR

ICLR 2025 Conference Submission 13668 Authors

28 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: Large language model; Operations Research; Automated Modeling
Abstract: Large Language Models (LLMs) have revolutionized various domains, but they face challenges when applied to highly specialized fields such as Operations Research (OR). In this work, we present Evo-Step-Instruct, a novel framework that progressively increases the complexity of generated problems using an evolutionary strategy, with the aim of enhancing the capabilities of LLMs in optimization modeling. Our framework integrates stepwise validation, which enables real-time error detection and correction during data generation, thereby improving data quality and preventing error propagation. We fine-tune open-source LLMs, such as LLaMA-3-8B and Mistral-7B, on the resulting high-quality dataset, producing a model, Evo-Step, that significantly outperforms baseline approaches on key benchmarks including NL4OPT, MAMO, and IndustryOR. Through extensive experiments, Evo-Step demonstrates superior performance, especially on complex OR tasks, achieving a notable improvement of 17.01% in micro-average accuracy on difficult problems. Our approach represents a substantial advancement in automating complex decision-making processes using LLMs, showcasing the potential of combining evolutionary problem generation with structured validation for fine-tuning LLMs.
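The abstract's core loop (evolutionary generation of progressively harder problems, with each candidate validated before it can seed later generations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `mutate` and `validate` functions here are hypothetical stand-ins for the paper's LLM-based problem rewriting and its real-time error checking.

```python
import random

def mutate(problem, step):
    """Hypothetical complexity-increasing mutation: in the paper this would be
    an LLM rewriting the OR problem; here we just append a new constraint tag."""
    return problem + [f"constraint_{step}"]

def validate(problem):
    """Hypothetical stepwise validator: in the paper this would check model
    solvability/correctness; here we reject problems with duplicate parts."""
    return len(problem) == len(set(problem))

def evo_step_generate(seed, generations=3, offspring=4, rng=None):
    """Evolve increasingly complex problems. Each candidate must pass
    validation before entering the pool, so errors cannot propagate
    into later, more complex generations."""
    rng = rng or random.Random(0)
    pool = [seed]
    for step in range(generations):
        candidates = [mutate(rng.choice(pool), step) for _ in range(offspring)]
        pool.extend(p for p in candidates if validate(p))  # stepwise gate
    return pool

dataset = evo_step_generate(["base_objective"])
```

The key design point the abstract emphasizes is the placement of the validation gate inside the loop: a candidate is checked immediately after generation, before it can be selected as a parent, rather than filtering the dataset once at the end.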
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13668