Silver Stepsize for Faster Zeroth-Order Optimization

ICLR 2026 Conference Submission 25582 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Zeroth-Order Optimization, Silver Stepsize, Gradient-Free
Abstract: We study gradient-free minimization of smooth convex functions via Silver stepsizes, a non-monotone, 2-adic schedule known to accelerate gradient descent, and show how to compose them with two-point zeroth-order (ZO) estimators on a smoothed objective. We apply the multi-step Lyapunov analysis behind the Silver schedule to smoothed objectives and show that it carries over verbatim when exact gradients are replaced by unbiased two-point estimators, at the cost of an additional quadratic variance term. We control this term via an orthogonal-on-spikes batching policy that allocates query directions proportionally to the Silver steps (capped at the dimension), achieving budget-optimal variance aggregation. Empirically, we validate the approach on numerical benchmarks and on MeZO-style, forward-pass-only fine-tuning of large language models, incorporating practical considerations such as clipping strategies, and demonstrate improved performance.
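To make the composition described in the abstract concrete, below is a minimal Python sketch, not the submission's code, assuming the Silver schedule h_t = 1 + rho^(nu2(t+1) - 1) of Altschuler and Parrilo (rho = 1 + sqrt(2), nu2 the 2-adic valuation), a two-point estimator over uniform sphere directions, and an illustrative batch rule b_t = min(d, ceil(c * h_t)); the names silver_zo_minimize and orthogonal_batch, and the constant c, are hypothetical.

# Hedged sketch: Silver stepsizes composed with a two-point ZO estimator.
# The batch rule and function names are illustrative assumptions, not the
# authors' released implementation.
import numpy as np

RHO = 1.0 + np.sqrt(2.0)  # the silver ratio

def nu2(n: int) -> int:
    """2-adic valuation: largest k such that 2**k divides n."""
    return (n & -n).bit_length() - 1

def silver_step(t: int) -> float:
    """Silver stepsize h_t = 1 + rho**(nu2(t+1) - 1), in units of 1/L."""
    return 1.0 + RHO ** (nu2(t + 1) - 1)

def orthogonal_batch(d: int, b: int, rng) -> np.ndarray:
    """b <= d pairwise-orthogonal unit directions (QR of a Gaussian matrix)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, b)))
    return q.T  # rows are orthonormal directions, each uniform on the sphere

def zo_gradient(f, x, mu: float, dirs: np.ndarray) -> np.ndarray:
    """Two-point estimator of the sphere-smoothed gradient, averaged over dirs."""
    g = np.zeros_like(x)
    for u in dirs:
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return (len(x) / len(dirs)) * g  # d/b scaling for sphere directions

def silver_zo_minimize(f, x0, L, T=127, mu=1e-4, c=1.0, seed=0):
    """ZO gradient descent with Silver steps; the direction batch grows with
    the step ("orthogonal on spikes"), capped at the dimension d."""
    rng = np.random.default_rng(seed)
    x, d = x0.copy(), len(x0)
    for t in range(T):
        h = silver_step(t)
        b = min(d, int(np.ceil(c * h)))      # directions proportional to h_t
        dirs = orthogonal_batch(d, b, rng)   # orthogonal at large (spike) steps
        g = zo_gradient(f, x, mu, dirs)
        x -= (h / L) * g
    return x

As a quick sanity check, silver_zo_minimize(lambda z: float(z @ z), np.ones(50), L=2.0) drives the iterate toward the origin using only function evaluations; larger Silver steps automatically receive larger, orthogonalized direction batches, which is one way to realize the variance aggregation the abstract describes.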
Primary Area: optimization
Submission Number: 25582