Abstract: Membership Inference Attacks (MIAs) aim to determine whether specific data samples belong to the private training dataset of a machine learning model. Many practical black-box MIAs require query access to the data distribution to train shadow models. Prior literature bounds the adversary's success by connecting it to overfitting (and, in turn, to differential privacy), noting that overfit models with high generalization error are more susceptible to attacks. However, overfitting does not fully account for privacy risks in models that generalize well. We take a complementary approach: by observing that label memorization can be reduced to membership inference, we present theoretical scenarios in which the adversary always launches a successful MIA (i.e., with extremely high advantage). We proceed to show that these attacks can be launched at a fraction of the cost of state-of-the-art attacks. We confirm our theoretical arguments with comprehensive experiments: by utilizing samples with high memorization scores, the adversary can (a) significantly improve attack efficacy regardless of the MIA used, and (b) reduce the number of shadow models by nearly two orders of magnitude compared to state-of-the-art approaches.
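For concreteness, the quantities referenced above can be read with their standard definitions in mind. The following is a minimal sketch assuming Feldman-style label memorization and the usual membership-advantage formulation; the notation (learning algorithm A, training set S, trained model h, attacker M) is illustrative and may differ from the paper's own.

\[
\mathrm{mem}(A, S, i) \;=\; \Pr_{h \sim A(S)}\bigl[h(x_i) = y_i\bigr] \;-\; \Pr_{h \sim A(S \setminus \{i\})}\bigl[h(x_i) = y_i\bigr]
\]
\[
\mathrm{Adv}(\mathcal{M}) \;=\; \Pr\bigl[\mathcal{M}(x, y, h) = 1 \mid (x, y) \in S\bigr] \;-\; \Pr\bigl[\mathcal{M}(x, y, h) = 1 \mid (x, y) \notin S\bigr]
\]

Under these (assumed) definitions, a sample with memorization score near 1 is one whose label the model predicts correctly essentially only when the sample is in the training set, which is why such samples are natural targets for a high-advantage attack.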
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Jonathan_Ullman1
Submission Number: 4930