Improving Generalization and Personalization in Long-Tailed Federated Learning via Classifier Retraining
Abstract: Non-IID data poses a substantial challenge to federated learning (FL), a popular distributed learning paradigm, and extensive research has been devoted to mitigating its impact. A further challenge that current FL algorithms face in real-world applications, however, is the presence of long-tailed data distributions. This issue often leads to inadequate model accuracy on rare but crucial classes in classification tasks. To cope with it, recent studies have proposed various classifier retraining (CR) approaches. Though effective, these approaches offer little insight into how retraining actually affects the classifier’s performance. In this work, we first present a systematic study of classifier retraining in FL, informed by mutual information indicators. Based on this study, we propose a novel and effective CR method for FL scenarios, coined CRFDC, to address the non-IID and long-tailed data challenges. Extensive experiments on standard FL benchmarks show that CRFDC improves model accuracy by up to 8.16% in generalization and 10.02% in personalization compared with state-of-the-art approaches. The code is available at https://github.com/harrylee999/CRFDC.