From Mean to Extreme: Formal Differential Privacy Bounds on the Success of Real-World Data Reconstruction Attacks

TMLR Paper4081 Authors

29 Jan 2025 (modified: 28 May 2025) · Under review for TMLR · Everyone · Revisions · BibTeX · CC BY 4.0
Abstract: Our work explores the role of differential privacy (DP) in protecting against a category of data reconstruction attacks from the literature that do not rely on prior knowledge of the data, namely analytic gradient inversion attacks. These attacks are particularly effective and difficult to detect in real-world scenarios: they operate under a threat model in which an adversary can manipulate the ML model before or during training *without* any knowledge of the input data beyond its dimensionality. Our theoretical contributions include (1) formulating an optimal attack strategy under the mean squared error for the specified threat model, (2) measuring the attack's success by comparing the reconstruction to the input data using three different metrics, and (3) computing theoretical bounds for these metrics. Notably, we analyse the probabilistic behaviour of the reconstruction success from expectation (*mean*) to tail behaviour (*extreme*). Additionally, we experimentally demonstrate and visualise the validity of our optimal reconstruction strategy and highlight the relevance of our theoretical bounds by comparing them to the attack's empirical success. Both theoretically and empirically, our work underscores the protection DP provides against analytic gradient inversion attacks across varying privacy guarantees and model choices. By bounding the success of data reconstruction attacks in real-world scenarios, we give practitioners a richer foundation for understanding specific reconstruction risks.
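To illustrate the general flavour of analytic gradient inversion (this is a minimal textbook-style sketch, *not* the paper's specific attack or bounds): for a linear layer with bias, a single sample's input is recoverable in closed form from the layer's gradients, and DP-style Gaussian gradient noise degrades that recovery. All names and parameters below (`sigma`, the layer sizes, the seed) are illustrative assumptions.

```python
import numpy as np

# Sketch, assuming a linear layer y = W x + b and a single sample x.
# Per-sample gradients satisfy dL/dW = (dL/dy) x^T and dL/db = dL/dy,
# so any row i with nonzero bias gradient gives x = (dL/dW)[i] / (dL/db)[i]
# exactly -- no prior knowledge of x beyond its dimensionality is needed.

rng = np.random.default_rng(0)
d_in, d_out = 8, 4
x = rng.normal(size=d_in)            # private input the adversary reconstructs
g_y = rng.normal(size=d_out)         # upstream gradient dL/dy

grad_W = np.outer(g_y, x)            # dL/dW
grad_b = g_y                         # dL/db

# Analytic reconstruction from the clean gradients.
i = int(np.argmax(np.abs(grad_b)))   # pick a row with a large bias gradient
x_hat = grad_W[i] / grad_b[i]

# DP-SGD-style Gaussian noise on the released gradients degrades the attack.
sigma = 0.5                          # illustrative noise multiplier
x_noisy = (grad_W[i] + sigma * rng.normal(size=d_in)) / (
    grad_b[i] + sigma * rng.normal()
)

print(np.allclose(x_hat, x))                  # exact recovery without noise
print(float(np.mean((x_noisy - x) ** 2)))     # nonzero MSE under noise
```

The ratio identity is what makes such attacks independent of the data's content; the paper's contribution is bounding, under DP noise, how close reconstructions like `x_noisy` can get to `x` in expectation and in the tails.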
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- **Enhanced Clarity & Organisation**: We have rewritten the Abstract and Introduction to provide a more focused narrative and clearly articulate the contributions of our work. While the key aspects remain the same, we have focused on making them more accessible.
- **Expanded Background Material**: We moved background material that was previously in the appendix into the main text and further expanded the Related Work section to provide essential context on Differential Privacy, Reconstruction Robustness (ReRo), and analytic gradient inversion attacks, improving accessibility.
- **Improved Theoretical Explanations**: We have revised the explanations surrounding our theoretical derivations, emphasising their practical relevance and main takeaways.
- **Strengthened Practical Implications**: We have added a section on interpreting our results, clarifying how practitioners can use them (and what they cannot do).
- **Refined Related Work**: The Related Work section has been thoroughly revised to better contextualise our research within the existing literature and highlight its novelty.
- **Added Simple Composition Theorem**: We now provide a preliminary worst-case estimate of the reconstruction risk over multiple training steps. While currently based on naive assumptions, this represents an important direction for future work and offers initial insights into the cumulative impact of repeated attacks.
Assigned Action Editor: ~Chuan_Guo1
Submission Number: 4081