Keywords: Federated Learning, fine-tuning, probabilistic masking, probabilistic filters, foundation models
Abstract: Foundation Models (FMs) have revolutionized machine learning with their adaptability and high performance across tasks; yet, their integration into Federated Learning (FL) is challenging due to the substantial communication overhead from their extensive parameterization. We present DeltaMask, a novel method that efficiently fine-tunes FMs in FL at an ultra-low bitrate, well below $1$ bpp. Departing from traditional weight training, DeltaMask employs stochastic masking to detect highly effective subnetworks within FMs and leverages the stochasticity and sparsity of client masks to compress updates into a compact grayscale image using probabilistic filters. Our comprehensive evaluation across $8$ datasets and $5$ pre-trained models of various network architectures demonstrates that DeltaMask achieves bitrates as low as $0.09$ bpp, improving communication efficiency while maintaining FM performance.
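The stochastic-masking idea in the abstract can be illustrated with a minimal sketch: instead of training weights, each client learns a per-parameter score, converts it to a probability, and samples a binary mask that selects a subnetwork of the frozen pre-trained model. The variable names (`scores`, `mask`, `weights`) and the use of a sigmoid over scores are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learnable per-parameter scores (trained in place of weights).
scores = rng.normal(size=10)

# Map scores to mask probabilities in (0, 1); sigmoid is an assumed choice.
theta = 1.0 / (1.0 + np.exp(-scores))

# Stochastic masking: sample a binary mask ~ Bernoulli(theta).
# Sparse, random masks like this are what probabilistic filters can
# later compress into a compact representation for transmission.
mask = (rng.uniform(size=theta.shape) < theta).astype(np.uint8)

# The frozen pre-trained weights, modulated by the sampled mask,
# define the selected subnetwork.
weights = rng.normal(size=10)
subnetwork = weights * mask
```

Only the binary mask (not the dense weight update) would need to be communicated, which is what enables bitrates below 1 bit per parameter.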
Submission Number: 5