Abstract: Many graph learning applications involve analyzing geometric graphs (e.g., nearest neighbor graphs over embeddings) built over sensitive data, thus requiring formal privacy protections.
In this paper, we study benchmark problems in privately analyzing geometric graphs obtained from high-dimensional embeddings. We provide several new results for the differentially private approximation of minimum spanning trees (MSTs) and hierarchical clustering in Euclidean graphs. Our algorithms achieve a near-optimal privacy-utility trade-off (up to constants), providing a $(1+\eta)$-multiplicative approximation with $\tilde{O}(\rho/\eta^2)$ additive error per edge of the tree under $\rho$-dist privacy (a generalization of DP to geometric data in which neighboring datasets differ in a single point moved by a distance of at most $\rho$). Furthermore, we establish a separation between Euclidean and general graphs by proving a lower bound of $\Omega(\rho\sqrt{n})$ additive error per edge of the tree for general graphs under a similar privacy notion, demonstrating that better utility is indeed achievable for geometric data (when multiplicative approximation is also allowed). Our algorithm can also be directly applied to a widely used MST-based clustering algorithm, incurring only a small loss in the approximation guarantee compared to its non-private counterpart.
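To make the $\rho$-dist neighboring relation concrete, the following is a minimal Python sketch, not the paper's algorithm and not taken from the linked repository. It builds a $\rho$-neighbor of a point set by moving one point by at most $\rho$, and computes the exact (non-private) Euclidean MST weight with Prim's algorithm. The function names `mst_weight` and `move_point` are hypothetical and introduced only for illustration. By the triangle inequality, every pairwise distance changes by at most $\rho$ between neighbors, so the MST weight changes by at most $(n-1)\rho$.

```python
import math
import random


def mst_weight(points):
    """Total weight of the exact Euclidean MST, via O(n^2) Prim's algorithm."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge connecting each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def move_point(points, index, rho, rng):
    """Return a rho-neighbor: a copy in which one point moves by at most rho.

    Hypothetical helper for illustration; the direction is uniformly random
    and the step length is drawn uniformly from [0, rho).
    """
    direction = [rng.gauss(0.0, 1.0) for _ in points[index]]
    norm = math.sqrt(sum(x * x for x in direction))
    step = rho * rng.random()
    moved = tuple(p + step * x / norm for p, x in zip(points[index], direction))
    neighbor = list(points)
    neighbor[index] = moved
    return neighbor


if __name__ == "__main__":
    rng = random.Random(0)
    points = [tuple(rng.gauss(0.0, 1.0) for _ in range(8)) for _ in range(30)]
    rho = 0.1
    neighbor = move_point(points, 3, rho, rng)
    gap = abs(mst_weight(points) - mst_weight(neighbor))
    print(f"MST weight gap: {gap:.4f}, bound (n-1)*rho: {(len(points) - 1) * rho:.4f}")
```

The $(n-1)\rho$ figure is only a worst-case sensitivity bound on the total MST weight. It says nothing about how the paper's private algorithms choose edges or calibrate noise.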
Code Dataset Promise: Yes
Code Dataset Url: https://github.com/zongruiovo/DP-MST
Signed Copyright Form: pdf
Format Confirmation: I agree that I have read and followed the formatting instructions for the camera ready version.
Code Dataset Upload: zip
Submission Number: 190