Abstract: Traditional federated learning mainly focuses on parallel settings (PFL), which can incur significant communication and computation costs. In contrast, one-shot and sequential federated learning (SFL) have emerged as innovative paradigms to alleviate these costs. However, non-IID (not independent and identically distributed) data remains a significant challenge in one-shot and SFL settings, exacerbated by the restricted communication between clients. In this paper, we improve one-shot sequential federated learning for non-IID data by proposing a local model diversity-enhancing strategy. Specifically, to leverage the potential of local model diversity for improving model performance, we introduce a local model pool for each client that comprises diverse models generated during local training, and propose two distance measurements to further enhance model diversity and mitigate the effect of non-IID data. Consequently, our proposed framework improves global model performance while maintaining low communication costs. Extensive experiments demonstrate that our method outperforms existing one-shot PFL methods and achieves better accuracy than state-of-the-art one-shot SFL methods on both label-skew and domain-shift tasks (e.g., a 6%+ accuracy improvement on the CIFAR-10 dataset).
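To make the described pipeline concrete, the sketch below illustrates one plausible reading of the abstract: a single sequential pass over clients, a per-client pool of model snapshots, and a parameter-space distance used as a diversity-encouraging regularizer. This is a minimal illustration, not the authors' released code; the function names (`diversity_distance`, `train_client`, `one_shot_sequential_fl`), the choice of L2 parameter distance, and the hyperparameters are assumptions for exposition only.

```python
# Hedged sketch of one-shot sequential FL with a local model pool and a
# distance-based diversity term. All names and details are illustrative
# assumptions, not the paper's exact formulation.
import copy
import torch
import torch.nn.functional as F


def flat_params(model, detach=True):
    """Concatenate all parameters of a model into a single 1-D tensor."""
    vec = torch.cat([p.reshape(-1) for p in model.parameters()])
    return vec.detach() if detach else vec


def diversity_distance(model, pool):
    """Mean parameter-space L2 distance from `model` to the pool snapshots
    (one plausible instantiation of a model-distance measurement)."""
    if not pool:
        return torch.tensor(0.0)
    v = flat_params(model, detach=False)  # keep grad for the current model
    return torch.stack([(v - flat_params(m)).norm() for m in pool]).mean()


def train_client(model, loader, pool, epochs=1, lr=0.01, div_weight=0.1):
    """Local training that encourages the current model to move away from
    earlier snapshots in the local pool, then adds a new snapshot."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            # Subtracting the distance rewards diversity from the pool.
            loss = loss - div_weight * diversity_distance(model, pool)
            loss.backward()
            opt.step()
        pool.append(copy.deepcopy(model))  # grow the local model pool
    return model, pool


def one_shot_sequential_fl(global_model, client_loaders):
    """Single sequential pass: each client receives the current model once,
    trains it locally with its own pool, and forwards the result, so the
    communication cost stays at one model transfer per client."""
    model = global_model
    for loader in client_loaders:
        model, _ = train_client(model, loader, pool=[])
    return model
```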
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Federated learning is inherently related to multimedia and multimodal applications, particularly in addressing domain-shift issues in images, because of its distributed learning approach and non-IID (not independent and identically distributed) data distribution. In federated learning, models are trained across multiple devices or servers holding local data samples, without exchanging those samples with other clients or servers. This approach is pivotal for multimedia content as it allows for the incorporation of diverse data sources, each potentially capturing different modalities (e.g., images with different styles, such as the PACS dataset used in our experiments, which contains images from four domains: photo, art painting, cartoon, and sketch) or variations within a non-IID data distribution (e.g., medical images from various hospitals). Our work addresses domain-shift multimodal tasks under the one-shot sequential federated learning setting. Therefore, this work is related to multimedia/multimodal processing.
Supplementary Material: zip
Submission Number: 2328