Provable Privacy Attacks on Trained Shallow Neural Networks

TMLR Paper9662 Authors

11 Jun 2026 (modified: 20 Jun 2026) | Under review for TMLR | CC BY 4.0

Abstract: We study which privacy attacks can be proven to succeed against trained 2-layer ReLU neural networks, focusing on two types of attack: membership inference and data reconstruction. We prove that theoretical results on the implicit bias of 2-layer neural networks can be used, in a high-dimensional setting, to provably identify with high probability whether a given point was in the training set, and, in a univariate setting, to construct a set of points of which at least a constant fraction are training points. To the best of our knowledge, our work is the first to show provable vulnerabilities in this implicit-bias-driven setting.

Submission Type: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=i7vGHtXJJc

Changes Since Last Submission: Fixed font to match TMLR style.

Assigned Action Editor: ~Jeremias_Sulam1

Submission Number: 9662
