FedDPSyn: Federated Tabular Data Synthesis with Computational Differential Privacy

Published: 01 May 2025, Last Modified: 07 May 2026, TPDP 2025 workshop, Everyone, CC BY 4.0
Abstract: We study differentially private (DP) data synthesis when the data is horizontally distributed across federated data owners. A line of work has investigated generating synthetic tabular data with DP guarantees, but to the best of our knowledge, all of these methods assume a trusted centralized server that executes the DP algorithm. In the federated setting, the naive alternative of having each data owner run a centralized method independently and then concatenating the resulting synthetic data causes a large loss in accuracy. In this work, we propose FedDPSyn, a new cryptography-based federated data synthesis framework that (1) achieves accuracy similar to that of the centralized model and (2) allows existing centralized data synthesis algorithms to be rewritten to run within it. Early empirical results show that our framework incurs only about 20 minutes of overhead in the online phase, which is reasonable in practice.
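To give some intuition for the accuracy gap mentioned in the abstract, the following minimal sketch compares a single noisy count released by a trusted central server with the sum of counts that each data owner perturbs on its own. It is not the FedDPSyn protocol, and the abstract does not state the cause of the gap; the sketch assumes, for illustration only, that the synthesizers rely on Laplace-noised counts. The owner counts, `epsilon`, and all function names are hypothetical.

```python
import random
import statistics


def laplace_noise(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count_central(rng, counts, epsilon):
    # Trusted server: sees the exact total and adds noise once
    # (a counting query has sensitivity 1).
    return sum(counts) + laplace_noise(rng, 1.0 / epsilon)


def noisy_count_local(rng, counts, epsilon):
    # No trusted server: every owner perturbs its own count before
    # sharing it, so the independent noise terms add up.
    return sum(c + laplace_noise(rng, 1.0 / epsilon) for c in counts)


def mean_squared_error(estimator, counts, epsilon, trials=20000, seed=0):
    # Empirical squared error of an estimator against the true total.
    rng = random.Random(seed)
    truth = sum(counts)
    return statistics.fmean(
        (estimator(rng, counts, epsilon) - truth) ** 2 for _ in range(trials)
    )


# Hypothetical counts of one attribute value held by 10 data owners.
counts = [120, 95, 130, 110, 105, 90, 140, 100, 115, 125]
epsilon = 1.0

mse_central = mean_squared_error(noisy_count_central, counts, epsilon)
mse_local = mean_squared_error(noisy_count_local, counts, epsilon)
print(f"central MSE: {mse_central:.2f}, per-owner MSE: {mse_local:.2f}")
```

Because a Laplace variable with scale 1/ε has variance 2/ε², the central release has variance 2/ε², while the sum over k owners has variance 2k/ε² at the same per-record ε. Under these assumptions, the error of the per-owner approach grows linearly with the number of owners, which is the kind of gap a cryptographic protocol that emulates the central server would aim to close.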