Sobolev Training of End-to-End Optimization Proxies

08 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Differentiable programming; Machine learning surrogate; Sobolev training; Multi-level Decision Making
TL;DR: Sobolev training—supervised or self-supervised—makes fast ML surrogates more accurate and reliable for large, safety-critical optimization tasks.
Abstract: Optimization proxies—machine-learning models trained to approximate the solution mapping of parametric optimization problems in a single forward pass—offer dramatic reductions in inference time compared to traditional iterative solvers. This work investigates the integration of solver sensitivities into such end-to-end proxies via a Sobolev-training paradigm and does so in \emph{two distinct settings}: (i) \emph{fully supervised} proxies, where exact solver outputs and sensitivities are available, and (ii) \emph{self-supervised} proxies that rely only on the objective and constraint structure of the underlying optimization problem. By augmenting the standard training loss with directional-derivative information extracted from the solver, the proxy aligns both its predicted solutions \emph{and} local derivatives with those of the optimizer. Under Lipschitz-continuity assumptions on the true solution mapping, matching first-order sensitivities is shown to yield uniform approximation error proportional to the training-set covering radius. Empirically, different impacts are observed in each studied setting. On three large Alternating Current Optimal Power Flow benchmarks, supervised Sobolev training cuts mean-squared error by up to 56\% and reduces the median worst-case constraint violation by up to a factor of four, while keeping the optimality gap below 0.22\%. For a mean–variance portfolio task trained without labeled solutions, self-supervised Sobolev training halves the average optimality gap in the medium-risk region (i.e., standard deviation above $10\%$ of budget) and matches the baseline elsewhere. Together, these results highlight Sobolev training—whether supervised or self-supervised—as a path to fast, reliable surrogates for safety-critical, large-scale optimization workloads.
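The supervised variant of the loss described in the abstract—penalizing mismatch in both the predicted solution and its directional derivative along a chosen direction—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear `proxy`, the direction `v`, and the solver targets `y_star`/`dy_star` are all placeholder assumptions, and `jax.jvp` supplies the proxy's directional derivative.

```python
import jax
import jax.numpy as jnp

# Hypothetical stand-in for a trained proxy network (here just a linear map).
def proxy(theta, x):
    W, b = theta
    return W @ x + b

def sobolev_loss(theta, x, y_star, dy_star, v, lam=1.0):
    """Supervised Sobolev loss: match the solver's solution y_star and its
    directional derivative dy_star = J_solver(x) @ v along direction v."""
    # jax.jvp gives the proxy's output and its Jacobian-vector product in one pass.
    y_pred, dy_pred = jax.jvp(lambda x_: proxy(theta, x_), (x,), (v,))
    value_term = jnp.mean((y_pred - y_star) ** 2)
    deriv_term = jnp.mean((dy_pred - dy_star) ** 2)
    return value_term + lam * deriv_term

# Toy usage with made-up data: an identity proxy perfectly matches an
# identity "solver", so the loss vanishes.
W, b = jnp.eye(3), jnp.zeros(3)
x = jnp.ones(3)
v = jnp.array([1.0, 0.0, 0.0])
y_star, dy_star = x, v            # pretend solver output and sensitivity
loss = sobolev_loss((W, b), x, y_star, dy_star, v)
```

The same structure carries over to the self-supervised setting by replacing the labeled targets with derivatives of the problem's objective and constraint residuals.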
Primary Area: optimization
Submission Number: 3183