TriplePlay: Enhancing Federated Learning with CLIP for Non-IID Data and Resource Efficiency

Published: 01 Jan 2024, Last Modified: 12 Nov 2025 · ICMLA 2024 · CC BY-SA 4.0
Abstract: The recent advancement of pretrained models presents great potential, as well as challenges, for the privacy-preserving distributed machine learning technique known as Federated Learning (FL). With the growing demand for foundation models, there is an urgent need to explore their potential in distributed settings. In this paper, we delve into the complexities of integrating foundation models such as CLIP into FL frameworks to preserve data privacy and efficiently train distributed network clients across heterogeneous data landscapes. We specifically address issues related to non-IID data distributions, skewed class representation in FL clients' local datasets, communication overhead, and the high resource consumption of training large, complex models in an FL setting. To this end, we propose TriplePlay, a framework that tailors the CLIP foundation model as an adapter to strengthen the FL model's performance and adaptability across heterogeneous client data distributions. In addition, we address the long-tail distribution problem in the FL environment to maintain fairness, and we optimize the computational resource demands of FL clients through quantization and low-rank adaptation techniques. Comprehensive simulation results on two distinct datasets under different FL settings demonstrate that TriplePlay efficiently reduces GPU usage and accelerates convergence, which ultimately reduces communication cost.
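To illustrate the resource-efficiency idea at a high level, the sketch below applies low-rank adaptation (LoRA) to a CLIP backbone so that an FL client only trains and transmits a small set of adapter weights. This is a minimal sketch under assumed tooling (Hugging Face `transformers` and `peft`); the checkpoint name, LoRA rank, and target modules are illustrative choices, not values taken from the paper, and the quantization step described in the abstract would be layered on top of this in a similar fashion.

```python
import torch
from transformers import CLIPModel
from peft import LoraConfig, get_peft_model

# Load a CLIP backbone (checkpoint name is an assumption for illustration).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")

# Inject low-rank adapters into the attention projections; only these
# small matrices are trainable, keeping client-side compute and memory low.
lora_cfg = LoraConfig(
    r=8,                                  # illustrative rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # CLIP attention projections
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()


def adapter_update(model: torch.nn.Module) -> dict:
    """Collect only the LoRA adapter weights for the client-to-server
    update, which is where the communication saving comes from."""
    return {k: v.detach().cpu() for k, v in model.state_dict().items() if "lora_" in k}
```

In a federated round, each client would fine-tune the adapters on its local (possibly non-IID, long-tailed) data and send back only the tensors returned by `adapter_update`, rather than the full CLIP weights.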