Abstract: Existing Federated Learning (FL) algorithms generally suffer from high communication costs and data heterogeneity, owing to the use of conventional loss functions for local model updates and the equal weighting of all local models during global model aggregation. In this article, we propose a novel FL approach to address these issues. For the local model update, we propose a loss function based on a disentangled Information Bottleneck (IB) principle. For global model aggregation, we propose a model selection strategy based on Mutual Information (MI). Specifically, we design a Lagrangian loss function that combines the IB principle with "disentanglement": it maximizes the MI between the ground truth and the model prediction while minimizing the MI between intermediate representations. We then compute the ratio of the MI between the ground truth and the model prediction to the MI between the original input and the ground truth in order to select effective models for aggregation. We analyze the theoretically optimal cost of the loss function, establish its optimal convergence rate, and quantify the outlier robustness of the aggregation scheme. Experiments demonstrate the superiority of the proposed FL approach in terms of test performance and communication speedup (i.e., 3.00-14.88 times for IID MNIST, 2.5-50.75 times for non-IID MNIST, 1.87-18.40 times for IID CIFAR-10, and 1.24-2.10 times for non-IID MIMIC-III).
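To make the two components of the abstract concrete, below is a minimal, hypothetical Python (PyTorch-style) sketch, not the paper's actual implementation. It assumes cross-entropy as a tractable surrogate for maximizing the MI between ground truth and prediction, a generic differentiable MI estimator `mi_estimator` (e.g., a MINE-style critic) for the disentanglement penalty, and invented names and hyperparameters (`beta`, `threshold`, `client_stats`) used purely for illustration.

```python
import torch
import torch.nn.functional as F

def disentangled_ib_loss(logits, targets, reps, beta=1e-2, mi_estimator=None):
    """Hypothetical Lagrangian-style disentangled-IB loss (sketch only).

    - Cross-entropy acts as a surrogate for maximizing I(Y; Y_hat),
      the MI between the ground truth and the model prediction.
    - A penalty on pairwise MI between intermediate representations
      approximates the "disentanglement" term being minimized.
    `reps` is a list of intermediate-layer representations.
    """
    task_term = F.cross_entropy(logits, targets)   # encourages high I(Y; Y_hat)
    disent_term = logits.new_zeros(())              # accumulates MI penalties
    if mi_estimator is not None:
        for i in range(len(reps)):
            for j in range(i + 1, len(reps)):
                disent_term = disent_term + mi_estimator(reps[i], reps[j])
    return task_term + beta * disent_term           # Lagrangian trade-off


def select_clients_by_mi_ratio(client_stats, threshold=0.5):
    """Hypothetical server-side selection by MI ratio (sketch only).

    Keeps clients whose ratio I(Y; Y_hat) / I(X; Y) exceeds a threshold,
    so only models whose predictions retain enough label information
    are included in the global aggregation.
    `client_stats` maps client id -> (mi_pred_label, mi_input_label).
    """
    selected = []
    for cid, (mi_pred_label, mi_input_label) in client_stats.items():
        ratio = mi_pred_label / max(mi_input_label, 1e-12)
        if ratio >= threshold:
            selected.append(cid)
    return selected
```

In this reading, each client minimizes the Lagrangian loss locally, reports its MI statistics to the server, and the server aggregates only the clients passing the ratio test; the exact estimators and thresholds would follow the paper's own formulation.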