Byzantine-Resilient Zero-Order Optimization for Scalable Federated Fine-Tuning of Large Language Models
Keywords: Byzantine-resilience, Data heterogeneity, Federated learning, Fine-tuning, Zero-order optimization
TL;DR: We introduce a transformed robust aggregation scheme tailored to federated zero-order optimization that provides low communication cost and Byzantine resilience under heterogeneous data distributions. We provide theoretical and empirical guarantees.
Abstract: We introduce FedByZO, a Byzantine-resilient federated zero-order optimization method that withstands Byzantine attacks while providing significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation, which yields convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations on standard learning tasks and on fine-tuning large language models show that FedByZO exhibits stable performance with a per-round communication cost of only a few scalars and reduced memory requirements.
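As a rough illustration of why the per-round communication cost can be only a few scalars, the following minimal sketch (not FedByZO itself; the function names, the trimmed-mean combiner standing in for the paper's transformed robust aggregation, and all hyperparameters are assumed placeholders) shows a seed-sharing zero-order round in which each client returns a single scalar and the server robustly aggregates the scalars before reconstructing the update from the shared seed:

```python
import numpy as np

def zo_scalar_grad(loss_fn, params, seed, eps=1e-3):
    """Two-point zero-order estimate returning a single scalar.
    The perturbation direction is regenerated from `seed`, so only the
    scalar (not the full direction) needs to be communicated."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(params.shape)
    return (loss_fn(params + eps * direction)
            - loss_fn(params - eps * direction)) / (2 * eps)

def robust_aggregate(scalars, trim_frac=0.2):
    """Illustrative robust combiner (trimmed mean) over the clients'
    scalar estimates; a stand-in for the paper's transformed robust
    aggregation, whose exact form is not given in the abstract."""
    s = np.sort(np.asarray(scalars))
    k = int(len(s) * trim_frac)
    return s[k:len(s) - k].mean()

def server_round(params, client_loss_fns, seed, lr=1e-2, eps=1e-3):
    """One round: broadcast a seed, collect one scalar per client,
    robustly aggregate, and apply an update reconstructed from the
    same seed on the server side."""
    scalars = [zo_scalar_grad(f, params, seed, eps) for f in client_loss_fns]
    g = robust_aggregate(scalars)
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(params.shape)
    return params - lr * g * direction
```

In this sketch the downlink carries only the seed (plus the model if clients do not already hold it) and the uplink carries one scalar per client, which is the property the abstract highlights; the robust aggregation step is what limits the influence of Byzantine clients that return corrupted scalars.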
Submission Number: 99