Keywords: Textual Differentiation, Preference Optimization, Automatic Heuristic Design, Large Language Model
TL;DR: TPD-AHD leverages a best-anchored strategy and textual differentiation to iteratively guide LLM-based heuristic design for consistently improved performance.
Abstract: The design of effective heuristics for complex combinatorial optimization problems has traditionally relied on extensive domain expertise and manual effort. While Large Language Model-based Automated Heuristic Design (LLM-AHD) offers a promising path toward autonomous heuristic generation, existing methods often suffer from undirected search processes and poor interpretability, resulting in a black-box optimization paradigm. To address these limitations, we introduce Textual Preference Differentiation for Automatic Heuristic Design (TPD-AHD), a novel framework that integrates preference optimization with textual feedback to guide LLM-driven heuristic evolution. TPD-AHD employs a best-anchored strategy to pair heuristic candidates and generates a textual loss expressed in natural language. This loss is then translated into a textual gradient that provides explicit, interpretable instructions for iterative heuristic refinement. This approach not only makes the optimization trajectory transparent but also directs the search toward high-performance regions. Extensive experiments on a suite of NP-hard combinatorial optimization problems demonstrate that TPD-AHD consistently outperforms both manually designed heuristics and existing LLM-AHD methods. Furthermore, it generalizes well across diverse domains and provides clear insight into the heuristic improvement process. TPD-AHD establishes a new paradigm for interpretable, efficient, and scalable automatic heuristic design.
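The abstract describes a three-step loop: best-anchored pairing of candidates, a natural-language textual loss, and a textual gradient that drives refinement. The following minimal Python sketch illustrates one plausible reading of that loop; the `llm_generate` and `evaluate` callables, the prompt wording, and the population handling are all illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of a TPD-AHD-style loop. `llm_generate` stands in
# for any LLM call (prompt string in, text out) and `evaluate` scores a
# heuristic (lower is better); both are hypothetical placeholders.

def tpd_ahd(initial_heuristics, evaluate, llm_generate, iterations=10):
    population = [(h, evaluate(h)) for h in initial_heuristics]
    for _ in range(iterations):
        population.sort(key=lambda pair: pair[1])
        best, best_score = population[0]
        new_population = [(best, best_score)]  # keep the anchor
        for cand, cand_score in population[1:]:
            # Best-anchored pairing: compare each candidate against the
            # current best and elicit a natural-language textual loss.
            textual_loss = llm_generate(
                f"Best heuristic (score {best_score}):\n{best}\n\n"
                f"Candidate heuristic (score {cand_score}):\n{cand}\n\n"
                "Explain in natural language why the candidate performs worse."
            )
            # Translate the textual loss into a textual gradient: explicit,
            # interpretable instructions for improving the candidate.
            textual_gradient = llm_generate(
                f"Given this analysis:\n{textual_loss}\n\n"
                "List concrete edits that would improve the candidate heuristic."
            )
            # Apply the textual gradient to produce a refined heuristic.
            refined = llm_generate(
                f"Apply these instructions:\n{textual_gradient}\n\n"
                f"to this heuristic:\n{cand}\n\nReturn the revised heuristic."
            )
            new_population.append((refined, evaluate(refined)))
        population = new_population
    return min(population, key=lambda pair: pair[1])[0]
```

Under this reading, anchoring every comparison to the current best is what makes the search directed: each textual gradient points from a weaker candidate toward the best-known region rather than drifting between arbitrary pairs.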
Primary Area: optimization
Submission Number: 17494