Spurious Privacy Leakage in Neural Networks

Chenxiang Zhang; Jun Pang; Sjouke Mauw

Published: 31 Aug 2025, Last Modified: 31 Aug 2025. Accepted by TMLR. CC BY 4.0

Abstract: Neural networks trained on real-world data often exhibit biases while simultaneously being vulnerable to privacy attacks aimed at extracting sensitive information. Despite extensive research on each problem individually, their intersection remains poorly understood. In this work, we investigate the privacy impact of spurious correlation bias. We introduce _spurious privacy leakage_, a phenomenon in which spurious groups are significantly more vulnerable to privacy attacks than non-spurious groups. We observe that privacy disparity between groups increases in tasks with simpler objectives (e.g. fewer classes) due to spurious features. Counterintuitively, we demonstrate that spurious robust methods, designed to reduce spurious bias, fail to mitigate privacy disparity. Our analysis reveals that this occurs because robust methods can reduce reliance on spurious features for prediction, but do not prevent their memorization during training. Finally, we systematically compare the privacy of different model architectures trained with spurious data, demonstrating that, contrary to previous work, architectural choice can affect privacy evaluation.

Submission Length: Regular submission (no more than 12 pages of main content)

Code: https://github.com/orientino/spurious-mia

Assigned Action Editor: ~Sanghyun_Hong1

Submission Number: 4898
