Abstract: Sharing relational databases is essential in today's data-driven world for fostering collaboration, enhancing efficiency, and enabling real-time data access. However, privacy and copyright concerns arise when sharing privacy-sensitive or valuable data, and shared data must retain high utility to support accurate data mining and analysis. Entry-level differentially private fingerprinting schemes (DPFSs) can address these concerns: data can be shared securely without leaking original values while still supporting accurate analysis, and detectable fingerprints deter unauthorized redistribution. However, existing DPFSs often lack utility, due to format changes and entry-wise bias, or robustness, since fingerprints can be removed without detection. In this paper, we propose an unbiased and robust DPFS that ensures the fingerprinted copy remains an unbiased estimate of the original data. By incorporating differential privacy noise, our scheme effectively mitigates alteration, collusion, and hybrid attacks. Our DPFS satisfies $\epsilon$-entry-level differential privacy, enabling clients to conduct unbiased analysis. To improve robustness, we design group-based fingerprint detection, which estimates the mean of the injected noise per group with error tolerance. We provide a theoretical robustness analysis and propose a method for achieving optimal robustness. Experiments on four real-world databases show that our scheme consistently detects fingerprints and improves accuracy by up to 20% on machine learning tasks compared to existing DPFSs.
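To make the unbiasedness claim concrete, the sketch below shows the standard Laplace mechanism for $\epsilon$-differential privacy: because the injected noise is mean-zero, the perturbed copy is an unbiased estimate of the original entries. This is a minimal illustration of the general principle, not the paper's actual fingerprinting scheme; the function name and parameters are hypothetical.

```python
import numpy as np

def perturb_entries(values, epsilon, sensitivity=1.0, rng=None):
    """Add mean-zero Laplace noise with scale sensitivity/epsilon to each entry.

    Hypothetical helper: illustrates why a Laplace-noised release stays an
    unbiased estimate of the original data (E[noise] = 0), which is the
    property the proposed DPFS preserves.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    scale = sensitivity / epsilon  # standard Laplace-mechanism scale
    noise = rng.laplace(loc=0.0, scale=scale, size=values.shape)
    return values + noise

rng = np.random.default_rng(42)
original = rng.normal(50.0, 5.0, size=100_000)   # stand-in numeric column
released = perturb_entries(original, epsilon=1.0, rng=rng)

# Mean-zero noise: the released mean tracks the original mean closely.
print(abs(released.mean() - original.mean()))
```

With 100,000 entries and noise scale $1/\epsilon = 1$, the sample-mean deviation is on the order of $\sqrt{2}/\sqrt{10^5} \approx 0.005$, which is why downstream aggregate analyses remain accurate.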
External IDs: dblp:journals/tifs/WangCCQYZYCY26