Enhancing Zero-Shot Black-Box Optimization via Pretrained Models with Efficient Population Modeling, Interaction, and Stable Gradient Approximation

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Meta Black-box Optimization, Black-box optimization, Pretrained Optimization Model, Neural Combinatorial Optimization
Abstract: Zero-shot optimization aims to generalize to previously unseen black-box optimization problems while outperforming state-of-the-art methods, without any task-specific tuning. Pre-trained optimization models (POMs) address this challenge by learning a general mapping from task features to optimization strategies, enabling direct deployment on new tasks. In this paper, we identify three essential components that determine the effectiveness of POMs: (1) task feature modeling, which captures structural properties of optimization problems; (2) optimization strategy representation, which defines how new candidate solutions are generated; and (3) the feature-to-strategy mapping mechanism learned during pre-training. However, existing POMs often suffer from weak feature representations, rigid strategy modeling, and unstable training. To address these limitations, we propose EPOM, an enhanced framework for pre-trained optimization. EPOM enriches task representations with a cross-attention-based tokenizer, improves strategy diversity through deformable attention, and stabilizes training by replacing non-differentiable operations with a differentiable crossover mechanism. Together, these enhancements yield better generalization, faster convergence, and more reliable performance in zero-shot black-box optimization.
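The abstract does not spell out EPOM's formulation, so as an illustration of the third enhancement only, the sketch below shows one common way to make crossover differentiable in PyTorch: a learned sigmoid gate replaces the hard binary gene-selection mask, so gradients can flow through the recombination step. The `SoftCrossover` module, its gate network, and all dimensions are hypothetical names chosen for this example, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SoftCrossover(nn.Module):
    """Differentiable stand-in for binary crossover: a learned gate
    blends two parent solutions instead of hard per-gene selection,
    so the recombination step admits gradient-based training.
    (Illustrative sketch, not EPOM's actual mechanism.)"""

    def __init__(self, dim: int):
        super().__init__()
        # Gate network maps the concatenated parents to per-gene mixing weights.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, parent_a: torch.Tensor, parent_b: torch.Tensor) -> torch.Tensor:
        # alpha in (0, 1): a soft relaxation of the usual {0, 1} crossover mask.
        alpha = self.gate(torch.cat([parent_a, parent_b], dim=-1))
        return alpha * parent_a + (1.0 - alpha) * parent_b

# Usage: recombine a population of 32 parent pairs of 10-D solutions.
pop_a, pop_b = torch.randn(32, 10), torch.randn(32, 10)
offspring = SoftCrossover(dim=10)(pop_a, pop_b)
```

Because the soft mask alpha lies in (0, 1) rather than {0, 1}, each offspring is a convex combination of its parents and the whole step is trainable end-to-end; rounding or annealing the gate toward hard selection would recover classic crossover at inference time.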
Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)
Submission Number: 9256