Abstract: Differential privacy federated learning (DPFL) has garnered tremendous attention for its ability to preserve clients' privacy during model training. However, directly training multi-modal models within DPFL frameworks often results in inferior learning performance because: 1) multi-modal imbalance, a common issue in model training, is not considered in DPFL when determining the scales of the artificial noise (AN) generated by differential privacy (DP); and 2) the AN alters the impact of individual modalities on the training process, further deteriorating the performance of multi-modal learning. In this paper, we propose a novel multi-modal differential privacy federated learning (MDPFL) framework to address these issues. Specifically, we first design a parameter-clipping method capable of handling heterogeneous modality quality. We then theoretically analyze the influence of variations in modality quality on learning performance by deriving an upper bound on the loss function. Next, based on this analysis, we construct a heuristic criterion to effectively assess the contribution of each client's uni-modal models (obfuscated by AN) to the overall learning performance. We further design a modality selection algorithm that enhances learning performance by discarding modalities whose contributions are low (due to the influence of AN). Extensive experimental results validate our theoretical analysis of modality contributions to learning performance in terms of accuracy. The results also demonstrate that our parameter-clipping method, tailored for MDPFL, significantly improves accuracy over the conventional clipping method used in DPFL frameworks, yielding gains of up to 10%, and that the proposed modality selection algorithm can further boost classification accuracy by 4%.
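For concreteness, below is a minimal Python sketch of the two mechanisms the abstract describes: a per-modality clip-and-noise step and a contribution-based modality selection. This is not the paper's method; the per-modality clip norms, the Gaussian mechanism, the `noise_multiplier`, and the contribution scores are illustrative assumptions standing in for the paper's actual clipping rule and heuristic criterion.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Clip a uni-modal parameter update to an L2 norm of `clip_norm`, then
    add Gaussian noise scaled to that norm (the standard Gaussian-mechanism
    recipe in DPFL; a modality-aware method would set `clip_norm` per
    modality -- those values are assumed here, not taken from the paper)."""
    scale = min(1.0, clip_norm / (np.linalg.norm(update) + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return update * scale + noise

def select_modalities(contributions, threshold):
    """Keep only modalities whose (noise-degraded) contribution score meets
    `threshold`; the scoring rule itself is the paper's heuristic and is
    not reproduced here."""
    return [m for m, score in contributions.items() if score >= threshold]

# Illustrative use: per-modality clip norms reflect heterogeneous quality.
rng = np.random.default_rng(0)
updates = {"image": rng.normal(size=128), "audio": rng.normal(size=128)}
clip_norms = {"image": 1.0, "audio": 0.5}   # assumed values
noisy = {m: privatize_update(u, clip_norms[m], 1.1, rng)
         for m, u in updates.items()}
kept = select_modalities({"image": 0.8, "audio": 0.2}, threshold=0.5)
```

The sketch mirrors the abstract's structure: clipping bounds each modality's sensitivity before noise is added, and selection discards modalities whose contribution the noise has pushed below a useful level.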
DOI: 10.1109/TIFS.2025.3598587