Achieving First-Order Statistical Improvements in Data-Driven Optimization

Published: 28 Nov 2025, Last Modified: 30 Nov 2025 · NeurIPS 2025 Workshop MLxOR · CC BY 4.0
Keywords: excess risk, first-order improvement, data-driven optimization
TL;DR: We analyze the statistical performance of robust data-driven optimization methods and find that their improvement over the empirical solution is typically at best second-order; first-order improvements require auxiliary information.
Abstract: The recent proliferation of data-optimization integration has led to a range of methods that aim to improve the statistical performance of data-driven optimization decisions. However, while many of these methods are motivated intuitively from a robustness or regularization perspective, their resulting statistical benefits are often less clear and, even when available, are argued in a case-by-case fashion. We provide a systematic dissection of data-driven optimization formulations through the lens of "directionally perturbed" empirical optimization, which demonstrably covers most existing formulations. On the negative side, we argue that under mild smoothness conditions, any such formulation can yield at best second-order improvements. On the positive side, we show that in the presence of auxiliary information, such as additional unsupervised data, we can construct a principled methodology, by building a connection to the Monte Carlo concept of control variates, that achieves general first-order improvements in excess risk.
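To make the control-variate connection concrete, here is a generic Monte Carlo variance-reduction sketch, not the paper's estimator: the target function `f`, the control `g` with known mean, and the coefficient choice are all illustrative assumptions. The idea is that subtracting a correlated quantity of known expectation leaves the estimate unbiased while shrinking its variance, which is the mechanism the abstract invokes for first-order excess-risk improvements.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)

# Illustrative target: estimate E[exp(X)] for X ~ N(0, 1) (true value exp(1/2)).
f = np.exp(x)
# Illustrative control variate: g(X) = X, whose mean E[g(X)] = 0 is known exactly.
g = x

# Near-optimal coefficient beta = Cov(f, g) / Var(g), estimated from the sample.
beta = np.cov(f, g)[0, 1] / g.var()

plain_estimate = f.mean()
cv_estimate = (f - beta * (g - 0.0)).mean()  # subtract known mean 0 of g

# The corrected samples have strictly smaller variance when f and g correlate.
print("plain:", plain_estimate, "var:", f.var())
print("cv:   ", cv_estimate, "var:", (f - beta * g).var())
```

Both estimators target the same expectation; the control-variate version trades a small amount of bookkeeping (knowing `E[g]`) for a variance reduction proportional to the squared correlation between `f` and `g`. The paper's auxiliary unsupervised data plays an analogous role to the known mean of `g` here.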
Submission Number: 163