Private Zeroth-Order Optimization with Public Data

Published: 11 Jun 2025, Last Modified: 10 Jul 2025 · ES-FoMo III · CC BY 4.0
Keywords: Zeroth-Order Optimization, Differential Privacy
Abstract: One of the major bottlenecks in deploying popular first-order differentially private (DP) machine learning algorithms (e.g., DP-SGD) is their high computation and memory cost, despite the existence of optimized implementations. Zeroth-order methods show promise in mitigating this overhead: they approximate gradients using only function evaluations, which makes them significantly easier to privatize. While recent works have explored zeroth-order approaches in both private and non-private settings, they still suffer from relatively low utility compared with DP-SGD and from limited application domains. In this work, we propose to leverage public information to guide and improve the gradient approximation of private zeroth-order algorithms. We explore a suite of public-data-assisted zeroth-order optimizers (PAZO) with minimal overhead. We provide theoretical analyses of the PAZO framework under an assumption of similarity between public and private data. Empirically, we demonstrate that PAZO achieves stronger privacy/utility tradeoffs across vision and text tasks in both pre-training and fine-tuning regimes, outperforming the best first-order baselines (with public gradients) especially in highly private regimes, while offering runtime speedups.
Submission Number: 93
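
To make the general idea concrete, below is a minimal sketch of one differentially private zeroth-order step of the kind the abstract describes: a two-point (SPSA-style) gradient estimate where only a scalar depends on private data, so clipping and Gaussian noise are applied to a single number. This is not the paper's PAZO algorithm; the `dp_zo_step` name and the public-gradient mixing heuristic are illustrative assumptions.

```python
import numpy as np

def dp_zo_step(theta, private_loss, public_grad=None, mu=1e-3, lr=1e-2,
               clip=1.0, noise_mult=1.0, rng=np.random.default_rng(0)):
    """One illustrative DP zeroth-order update on parameters `theta` (1-D array).

    private_loss: callable mapping parameters to a scalar loss on private data.
    public_grad: optional gradient estimated on public data, used here to
        bias the random perturbation direction (hypothetical heuristic,
        not the paper's method).
    """
    # Random perturbation direction; optionally biased toward the public
    # gradient so the function evaluations probe a more informative direction.
    z = rng.standard_normal(theta.shape)
    if public_grad is not None:
        g = public_grad / (np.linalg.norm(public_grad) + 1e-12)
        z = 0.5 * z + 0.5 * g * np.sqrt(theta.size)  # hypothetical mixing rule

    # Two-point finite-difference estimate of the directional derivative.
    d = (private_loss(theta + mu * z) - private_loss(theta - mu * z)) / (2 * mu)

    # Privatize: only this scalar touches private data, so clip its magnitude
    # and add Gaussian noise to one number rather than to a full gradient.
    d = np.clip(d, -clip, clip) + noise_mult * clip * rng.standard_normal()

    return theta - lr * d * z
```

A usage example under these assumptions: `theta = dp_zo_step(theta, loss_fn, public_grad=g_pub)`, where `loss_fn` evaluates the model on a private minibatch and `g_pub` is a gradient computed on public data. Privatizing a scalar instead of a high-dimensional gradient is what makes zeroth-order methods "significantly easier to privatize" in the abstract's sense.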