Towards Memory-Efficient Foundation Models in Medical Imaging: A Federated Learning and Knowledge Distillation Approach
Keywords: Federated Learning, Foundation Models, Knowledge Distillation, Medical Imaging.
Abstract: The rapid development of medical foundation models has shown great promise for various healthcare applications. However, fine-tuning these models for downstream tasks remains challenging due to privacy concerns that limit centralized data collection from diverse sources. Federated learning (FL) offers a privacy-preserving solution by enabling multiple clients to collaboratively train a global model without sharing their local data. Despite its advantages, FL must balance model performance with communication and computation costs. Existing approaches often use parameter-efficient fine-tuning (PEFT) techniques to reduce communication overhead by transmitting fewer parameters. However, these methods still require clients to host large foundation models, which is impractical for clients with limited memory. Meanwhile, conventional knowledge distillation (KD) methods fall short in FL due to the misalignment between pre-trained foundation models and specific downstream tasks. To overcome these limitations, we propose Federated Reprogramming Knowledge Distillation (FedRD), a method that deploys lightweight student models on clients and a medical foundation model on the server. A reprogramming module aligns the foundation model's feature space with the downstream task, enabling the student models to collaboratively mimic this representation. FedRD significantly reduces memory and computation requirements while maintaining high accuracy. Experiments on three medical imaging datasets under non-IID data distributions demonstrate that FedRD outperforms federated KD and PEFT methods, offering a favorable trade-off among accuracy, communication cost, and computational efficiency.
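The abstract does not include implementation details, so the following is only a minimal sketch, assuming a PyTorch setup, of how the described components could fit together: a server-side reprogramming module that projects frozen foundation-model features into a task-aligned space, a lightweight client student trained to mimic that representation alongside its local classification objective, and standard FedAvg aggregation of student weights. All module names, dimensions, and loss weights here are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of the FedRD idea described above (not the authors' code).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM = 256  # assumed shared feature dimension for distillation


class ReprogrammingModule(nn.Module):
    """Server-side adapter mapping frozen foundation-model features
    to a task-aligned representation space (assumed MLP form)."""
    def __init__(self, in_dim: int, out_dim: int = FEAT_DIM):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, out_dim), nn.GELU(), nn.Linear(out_dim, out_dim)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.proj(feats)


class StudentModel(nn.Module):
    """Lightweight client model that mimics the reprogrammed representation."""
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, FEAT_DIM), nn.ReLU())
        self.head = nn.Linear(FEAT_DIM, num_classes)

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)
        return z, self.head(z)


def client_loss(student_feat, teacher_feat, logits, labels, alpha: float = 0.5):
    """Local objective: cross-entropy on client labels plus a
    feature-mimicking term against the (detached) reprogrammed features."""
    ce = F.cross_entropy(logits, labels)
    kd = F.mse_loss(student_feat, teacher_feat.detach())
    return ce + alpha * kd


def federated_average(state_dicts):
    """FedAvg over the lightweight student weights returned by clients."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(0)
    return avg
```

In this sketch, only the small student weights would be communicated and aggregated, which is consistent with the memory and communication savings the abstract claims; how the reprogrammed teacher features reach each client (e.g., computed on the server and transmitted per batch) is left unspecified here, since the abstract does not say.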
Submission Number: 51