SLEEP: Simulated Future Learning Environments for Automated Evolution of Heuristic Portfolios

Published: 16 Jun 2026, Last Modified: 24 Jun 2026
Venue: ICML 2026 Workshop DL4C Poster
License: CC BY-NC 4.0
Keywords: LLMs for code generation, automatic heuristic design, continual learning
TL;DR: We introduce continual heuristic portfolio learning, a new setting where heuristic portfolios must adapt to temporally shifting optimization distributions by learning simulated future “sleep” environments for portfolio evolution.
Abstract: We introduce continual heuristic portfolio learning, a new setting of automatic heuristic design (AHD) for nonstationary combinatorial optimization, where instances arrive over time and the learner must maintain a compact set of executable heuristics for future distributions. This setting naturally leverages LLM code generation: candidate heuristics can be synthesized as programs, executed inside a solver, and improved from objective feedback. We propose SLEEP, a sleep-wake framework that learns to construct synthetic training environments for an inner automatic heuristic set design (AHSD) procedure. During sleep, SLEEP prepares simulated and replayed instances for an LLM-based evolutionary portfolio designer; during wake, real-instance performance provides delayed feedback. Rather than forecasting the next distribution explicitly, SLEEP learns environments that induce robust and complementary heuristics. We further introduce EoH-C, a counterfactual extension of EoH-S that incorporates recent wake replay during portfolio selection. Experiments on classical combinatorial optimization benchmarks, including TSP and VRP variants, demonstrate improved future-instance performance under distribution shift.
Submission Number: 119
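
The abstract describes maintaining a compact set of complementary heuristics that is evaluated on a mix of simulated and replayed instances. As a rough illustration only, the sketch below shows one common way to select such a portfolio: greedy selection that minimizes the mean best-of-portfolio cost. It is not the authors' SLEEP or EoH-C procedure. The heuristic names, instance names, cost values, and the functions `portfolio_cost` and `greedy_select` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical cost table: costs[heuristic][instance] = objective value (lower is better).
Costs = Dict[str, Dict[str, float]]


def portfolio_cost(costs: Costs, portfolio: List[str], instances: List[str]) -> float:
    """Mean over instances of the lowest cost achieved by any portfolio member."""
    total = sum(min(costs[h][i] for h in portfolio) for i in instances)
    return total / len(instances)


def greedy_select(costs: Costs, instances: List[str], k: int) -> List[str]:
    """Greedily add the heuristic that most reduces the portfolio cost, up to size k."""
    candidates = sorted(costs)  # sorted for deterministic tie-breaking
    portfolio: List[str] = []
    while len(portfolio) < min(k, len(candidates)):
        remaining = [h for h in candidates if h not in portfolio]
        best = min(
            remaining,
            key=lambda h: portfolio_cost(costs, portfolio + [h], instances),
        )
        portfolio.append(best)
    return portfolio


# Assumed toy data: two simulated ("sleep") instances and one replayed ("wake") instance.
costs: Costs = {
    "nearest_neighbor": {"sim_0": 10.0, "sim_1": 30.0, "replay_0": 12.0},
    "farthest_insertion": {"sim_0": 20.0, "sim_1": 15.0, "replay_0": 25.0},
    "random_insertion": {"sim_0": 14.0, "sim_1": 21.0, "replay_0": 16.0},
}
instances = ["sim_0", "sim_1", "replay_0"]

portfolio = greedy_select(costs, instances, k=2)
print(portfolio, portfolio_cost(costs, portfolio, instances))
```

Scoring each instance by the best heuristic in the set rewards complementarity: a heuristic that is weak on average can still earn a slot if it covers instances the others handle poorly.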