Abstract: Large Language Models (LLMs) achieve state-of-the-art performance but are challenging to deploy due to their high computational and storage demands. Pruning can reduce model size, yet existing methods assume public access to calibration data, which is impractical for privacy-sensitive applications. To address the challenge of pruning LLMs in privacy-preserving settings, we propose FedSpaLLM, the first federated learning framework designed specifically for pruning LLMs. FedSpaLLM enables clients to locally prune their models based on private data while accounting for system heterogeneity and maintaining communication efficiency. Our framework introduces several key innovations: (1) a novel ℓ0-norm aggregation function that ensures only non-zero weights are averaged across clients, preserving important model parameters; (2) an adaptive mask expansion technique that meets global sparsity targets while accommodating client-specific pruning decisions; and (3) a layer sampling strategy that reduces communication overhead and personalizes the pruning process based on client resources. Extensive experiments show that FedSpaLLM improves pruning performance in diverse federated settings. The source code can be found at https://github.com/BaiTheBest/FedSpaLLM.
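The ℓ0-norm aggregation idea in innovation (1) can be sketched as follows: at each weight position, the server averages only over the clients that kept that weight non-zero, so a parameter pruned by some clients but retained by others is not diluted toward zero. This is a minimal NumPy illustration under our own assumptions; the function name and shapes are ours, not the paper's API.

```python
import numpy as np

def l0_aggregate(client_weights):
    """Average pruned client weight tensors position-wise, counting only
    non-zero contributions (a sketch of l0-norm aggregation; hypothetical
    helper, not the FedSpaLLM implementation).

    Positions pruned (zero) on every client remain zero in the result.
    """
    stacked = np.stack(client_weights)            # (num_clients, *weight_shape)
    counts = (stacked != 0).sum(axis=0)           # non-zero contributors per position
    sums = stacked.sum(axis=0)
    # Divide by the number of non-zero contributors; guard against 0/0.
    return np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

# Example: three clients, each with a differently pruned 3-weight layer.
clients = [np.array([1.0, 0.0, 2.0]),
           np.array([3.0, 0.0, 0.0]),
           np.array([0.0, 0.0, 4.0])]
agg = l0_aggregate(clients)  # → [2.0, 0.0, 3.0]
```

Note how the third weight averages to 3.0 (the mean of 2.0 and 4.0 over the two clients that kept it), whereas naive federated averaging over all three clients would shrink it to 2.0.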