Ontology- and LLM-based Data Harmonization for Federated Learning in Healthcare

Published: 01 Jan 2025, Last Modified: 30 Jul 2025, Venue: CoRR 2025, License: CC BY-SA 4.0
Abstract: The rise of electronic health records (EHRs) has unlocked new opportunities for medical research, but privacy regulations and data heterogeneity remain key barriers to large-scale machine learning. Federated learning (FL) enables collaborative modeling without sharing raw data, yet faces challenges in harmonizing diverse clinical datasets. This paper presents a two-step data alignment strategy integrating ontologies and large language models (LLMs) to support secure, privacy-preserving FL in healthcare, demonstrating its effectiveness in a real-world project involving semantic mapping of EHR data.