Hessian-Dependent Sample Complexity in Zeroth-Order Stochastic Optimization: Nonconvex Support Sampling Is Necessary for Optimality
Keywords: Stochastic Zeroth-Order Optimization, Instance-Dependent Rates, Lipschitz Hessian, Non-Convex Sampling, Adaptive Sampling, Spectral Methods in Optimization, Bandit Convex Optimization
TL;DR: We show that achieving optimal, instance-dependent rates for smooth convex functions requires a non-convex sampling geometry, a departure from all prior methods.
Abstract: Zeroth-order stochastic optimization is a fundamental formulation that arises in real-world design problems where gradients are inaccessible. A central challenge in this setting is to design gradient estimators and optimization algorithms that, under noisy, function-only feedback, leverage the local Hessian-based geometry to achieve optimal sample efficiency. We introduce the Spectrally Grouped Estimator, a novel gradient-estimation scheme that samples over a non-convex set formed by a union of sphere sections, and use it to build an algorithm whose Hessian-dependent simple regret over second-order-smooth, strongly convex functions improves order-wise on that of conventional baseline methods, which sample over convex sets. We complement these results with the first tight analyses of the baseline schemes, formally demonstrating their shared performance bottleneck and thus the necessity of non-convex sampling for optimality. We conjecture that our achieved Hessian-dependent rates are universally optimal.
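For context, here is a minimal sketch of the kind of conventional baseline the abstract contrasts against: the classical two-point spherical gradient estimator under noisy, function-only feedback. This illustrates the baseline family only; it is not the paper's Spectrally Grouped Estimator (whose union-of-sphere-sections sampling set is not specified here), and the quadratic objective, noise level, and sample count are illustrative assumptions.

```python
import numpy as np

def two_point_sphere_estimate(f, x, delta, rng):
    """Classical two-point zeroth-order gradient estimate:
    (d / (2*delta)) * (f(x + delta*u) - f(x - delta*u)) * u,
    with u drawn uniformly from the unit sphere. A standard
    baseline scheme, not the paper's Spectrally Grouped Estimator."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniform random direction on the unit sphere
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

# Illustrative usage on a noisy quadratic (assumed objective, not from the paper).
rng = np.random.default_rng(0)
noisy_f = lambda z: 0.5 * z @ z + 1e-3 * rng.standard_normal()  # f(z) + noise
x = np.ones(3)
estimates = [two_point_sphere_estimate(noisy_f, x, 1e-2, rng) for _ in range(2000)]
print(np.mean(estimates, axis=0))  # approximately the true gradient, x = [1, 1, 1]
```

Averaging many such estimates recovers the gradient of a smoothed surrogate of the objective; the paper's argument is that estimators of this convex-support type share a performance bottleneck that its non-convex sampling geometry avoids.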
Submission Number: 61