Keywords: Diffusion Models, Kernel Density Estimation, Black-Box Optimization
TL;DR: We propose a Support-Proximity Augmented Diffusion Estimation method for Offline Black-Box Optimization.
Abstract: Offline black-box optimization aims to discover novel designs with high property scores using only a static dataset, a task fundamentally challenged by the out-of-distribution (OOD) extrapolation problem. Existing approaches typically bifurcate into inverse methods, which struggle with the ill-posed nature of mapping scores to designs, and forward methods, which often lack the distributional expressivity to quantify uncertainty effectively. In this work, we propose \textbf{SPADE} (\textbf{S}upport-\textbf{P}roximity \textbf{A}ugmented \textbf{D}iffusion \textbf{E}stimation), a novel framework that reimagines forward surrogate modeling through the lens of conditional generative modeling. SPADE models the forward likelihood $p(y|\boldsymbol{x})$ using a diffusion model, but with two critical enhancements to tailor it for optimization: (1) a \emph{Calibrated Diffusion Estimation} module that enforces global consistency in statistical moments and pairwise rankings, and (2) a \emph{Support-Proximity Regularization} mechanism that implicitly internalizes the data manifold constraint $p(\boldsymbol{x})$ via kNN-based density estimation. Theoretically, we prove that our regularization is first-order equivalent to maximizing a Bayesian posterior with a valid design prior. Empirically, SPADE achieves state-of-the-art performance across Design-Bench tasks and an LLM data mixture optimization benchmark. Our code is available at \href{https://github.com/HarryYoung2018/spade}{https://github.com/HarryYoung2018/spade}.
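The abstract does not say which kNN estimator SPADE uses or how the regularizer enters the objective. As a rough illustration only, the sketch below assumes the classical kNN density estimate $\hat{p}(\boldsymbol{x}) \approx k / (N V_d r_k(\boldsymbol{x})^d)$, where $r_k$ is the distance to the $k$-th nearest dataset point and $V_d$ is the volume of the unit ball. The names `knn_log_density`, `penalized_score`, and the weight `lam` are hypothetical and are not taken from the SPADE code base.

```python
import math
import numpy as np

def knn_log_density(query, data, k=5):
    """Log of the classical kNN density estimate k / (N * V_d * r_k^d).

    Illustrative assumption: this is a textbook estimator, not necessarily
    the one used by SPADE.
    """
    query = np.atleast_2d(query)
    n, d = data.shape
    # Euclidean distance from every query point to every dataset point
    dists = np.linalg.norm(query[:, None, :] - data[None, :, :], axis=-1)
    # Distance to the k-th nearest neighbour of each query point
    r_k = np.sort(dists, axis=1)[:, k - 1]
    # Log-volume of the unit ball in d dimensions
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    return math.log(k) - math.log(n) - log_unit_ball - d * np.log(r_k + 1e-12)

def penalized_score(pred_score, query, data, lam=0.1, k=5):
    """Hypothetical objective: surrogate score plus a log-density bonus,
    so that candidates far from the data support are penalized."""
    return pred_score + lam * knn_log_density(query, data, k)

# Toy usage: a candidate near the data versus one far outside its support
rng = np.random.default_rng(0)
dataset = rng.normal(size=(200, 2))
candidates = np.array([[0.0, 0.0], [10.0, 10.0]])
print(penalized_score(np.array([1.0, 1.0]), candidates, dataset))
```

Adding a weighted log-density term to a predicted score mirrors the form of a log-posterior, $\log p(y|\boldsymbol{x}) + \log p(\boldsymbol{x})$, which is the intuition behind treating $p(\boldsymbol{x})$ as a design prior; the exact objective and its first-order equivalence are established in the paper itself, not by this sketch.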
Submission Number: 54