Byzantine-Resilient Zero-Order Optimization for Scalable Federated Fine-Tuning of Large Language Models
Keywords: Byzantine-resilience, Data heterogeneity, Federated learning, Fine-tuning, Zero-order optimization
TL;DR: We introduce a transformed robust aggregation scheme tailored to federated zero-order optimization that provides low communication cost and Byzantine resilience under heterogeneous data distributions. We provide theoretical and empirical guarantees.
Abstract: We introduce FedByZO, a Byzantine-resilient federated zero-order optimization method that withstands Byzantine attacks while providing significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation, which yields convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations on standard learning tasks and on fine-tuning large language models show that FedByZO exhibits stable performance with a per-round communication cost of only a few scalars and reduced memory requirements.
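As a rough illustration of why the per-round communication cost can be only a few scalars, the following minimal sketch (not FedByZO itself; the function names, the trimmed-mean combiner standing in for the paper's transformed robust aggregation, and all hyperparameters are assumed placeholders) shows a seed-sharing zero-order round in which each client returns a single scalar and the server robustly aggregates the scalars before reconstructing the update from the shared seed:

```python
import numpy as np

def zo_scalar_grad(loss_fn, params, seed, eps=1e-3):
    """Two-point zero-order estimate returning a single scalar.
    The perturbation direction is regenerated from `seed`, so only the
    scalar (not the full direction) needs to be communicated."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(params.shape)
    return (loss_fn(params + eps * direction)
            - loss_fn(params - eps * direction)) / (2 * eps)

def robust_aggregate(scalars, trim_frac=0.2):
    """Illustrative robust combiner (trimmed mean) over the clients'
    scalar estimates; a stand-in for the paper's transformed robust
    aggregation, whose exact form is not given in the abstract."""
    s = np.sort(np.asarray(scalars))
    k = int(len(s) * trim_frac)
    return s[k:len(s) - k].mean()

def server_round(params, client_loss_fns, seed, lr=1e-2, eps=1e-3):
    """One round: broadcast a seed, collect one scalar per client,
    robustly aggregate, and apply an update reconstructed from the
    same seed on the server side."""
    scalars = [zo_scalar_grad(f, params, seed, eps) for f in client_loss_fns]
    g = robust_aggregate(scalars)
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(params.shape)
    return params - lr * g * direction
```

In this sketch the downlink carries only the seed (plus the model if clients do not already hold it) and the uplink carries one scalar per client, which is the property the abstract highlights; the robust aggregation step is what limits the influence of Byzantine clients that return corrupted scalars.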
Submission Number: 99