A Privacy-Preserving and Unified Federated Learning Framework for Trajectory Data Preparation

08 May 2025 (modified: 29 Oct 2025) · Submitted to NeurIPS 2025 · CC BY 4.0
Keywords: Trajectory Data Preparation, Federated Learning, Large Language Model
Abstract: Trajectory data, which captures the movement patterns of people and vehicles over time and space, is crucial for applications such as traffic optimization and urban planning. However, issues such as noise and incompleteness often compromise data quality, leading to inaccurate trajectory analyses and limiting the potential of these applications. While Trajectory Data Preparation (TDP) can enhance data quality, existing methods suffer from two key limitations: (i) they do not address data privacy concerns, particularly in federated settings where sharing trajectory data is prohibited, and (ii) they typically design task-specific models that lack generalizability across diverse TDP scenarios. To overcome these challenges, we propose FedTDP, a privacy-preserving and unified framework that leverages the multi-task learning capabilities of Large Language Models (LLMs) for TDP in federated environments. Specifically, we (i) design a trajectory privacy autoencoder that secures data transmission and protects data privacy, supported by theoretical analysis; (ii) introduce a trajectory knowledge enhancer that builds TDP-oriented LLMs by improving the model's learning of TDP knowledge; and (iii) propose federated parallel optimization, which improves training efficiency by reducing data transmission and enabling parallel model training. Experiments on 6 real datasets and 10 mainstream TDP tasks demonstrate that FedTDP consistently outperforms 13 state-of-the-art baselines. All code and data are available at https://anonymous.4open.science/r/FedTDP.
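The abstract does not disclose implementation details, but the core idea of the trajectory privacy autoencoder, as stated, is that clients transmit learned embeddings rather than raw trajectories. The PyTorch sketch below illustrates one minimal reading of that idea; the class name, layer sizes, and training loop are all assumptions for illustration, not the paper's actual architecture, and this toy example carries none of the theoretical privacy analysis the paper claims.

```python
import torch
import torch.nn as nn

class TrajectoryPrivacyAutoencoder(nn.Module):
    """Illustrative sketch (not the paper's design): a client encodes raw
    trajectory points into latent embeddings before transmission, so raw
    coordinates never leave the client."""

    def __init__(self, point_dim: int = 3, hidden_dim: int = 64, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(point_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, point_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# Local training on a client's own trajectories (here: random placeholder
# points of the assumed form [latitude, longitude, timestamp]).
model = TrajectoryPrivacyAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

trajectories = torch.randn(32, 3)  # placeholder local data
for _ in range(100):
    optimizer.zero_grad()
    reconstruction = model(trajectories)
    loss_fn(reconstruction, trajectories).backward()
    optimizer.step()

# After training, only the latent embeddings are sent upstream in place
# of the raw trajectory points.
with torch.no_grad():
    embeddings = model.encoder(trajectories)
```

In a federated setting, such an encoder would run on each client, and only `embeddings` (plus model updates, depending on the protocol) would cross the network; how FedTDP actually structures this exchange and proves its privacy properties is detailed in the paper itself.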
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 10076