Keywords: Privacy, Membership inference attacks
TL;DR: An MIA evaluation setup for estimating privacy risk specific to a record and model release
Abstract: Record-specific Membership Inference Attacks (MIAs) are widely used to evaluate the propensity of a machine learning (ML) model to memorize an individual record and the privacy risk that releasing the model therefore poses. Record-specific MIAs are currently evaluated the same way ML models are: on a test set of models trained on data samples not seen during attack training ($D_{eval}$). A large body of recent literature has, however, shown that the main risk often comes from outliers, records that are statistically different from the rest of the dataset. In this work, we argue that the traditional evaluation setup for record-specific MIAs, which includes dataset sampling as a source of randomness, incorrectly captures the privacy risk. Indeed, what constitutes an outlier is highly specific to a particular data sample, and a record that is an outlier in the training dataset will not necessarily be one in randomly sampled test datasets. We propose to use model randomness as the only source of randomness when evaluating record-level MIAs, a setup we call *model-seeded* evaluation. Across 10 combinations of models, datasets, and attacks for predictive and generative AI, we show that the per-record risk estimates given by the traditional evaluation setup substantially differ from those given by the *model-seeded* setup, which properly accounts for the increased risk posed by outliers. We show that, across setups, the traditional evaluation method leads to a substantial number of records being incorrectly classified as low risk, emphasizing its inadequacy for capturing record-level risk. We then a) provide evidence that the traditional setup estimates an average, across datasets, of the *model-seeded* risk, validating our use of model randomness to create evaluation models, and b) show how relying on the traditional setup can conceal the existence of stronger attacks: it would, for instance, strongly underestimate the risk posed by the strong Differential Privacy adversary. We believe our results convincingly show that the practice of randomizing datasets to evaluate record-specific MIAs is incorrect. Relying instead on model randomness, our *model-seeded* evaluation, better captures the risk posed by outliers and should be used moving forward to evaluate record-level MIAs against both predictive and generative machine learning models.
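The core methodological difference can be illustrated with a short sketch. Below is a minimal, self-contained Python example (scikit-learn >= 1.1) contrasting the two setups for a simple loss-threshold attack on a binary classifier. Everything here, the synthetic data, the `SGDClassifier` target model, and the function names `traditional_eval` and `model_seeded_eval`, is an illustrative assumption and not the paper's actual implementation; the point is only that the traditional setup resamples the training dataset for every evaluation model, while the model-seeded setup fixes the released dataset and varies only the training seed.

```python
# Minimal sketch: traditional vs. model-seeded evaluation of a record-specific
# MIA, using a simple loss-threshold attack. All names and modelling choices
# here are illustrative assumptions, not the paper's implementation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
TARGET = 0  # index of the record whose privacy risk we want to estimate

def target_loss(model, x, label):
    """Cross-entropy loss of the model on the target record (attack score)."""
    p = model.predict_proba(x.reshape(1, -1))[0, label]
    return -np.log(max(p, 1e-12))

def fit(X_tr, y_tr, seed):
    """Train one evaluation model; the seed is the model randomness."""
    return SGDClassifier(loss="log_loss", random_state=seed).fit(X_tr, y_tr)

def traditional_eval(n_models=50, n_train=1000):
    """Traditional setup: every evaluation model gets a freshly sampled
    dataset, so dataset sampling is an extra source of randomness."""
    losses_in, losses_out = [], []
    for seed in range(n_models):
        idx = rng.choice(np.arange(1, len(X)), size=n_train - 1, replace=False)
        X_in = np.vstack([X[idx], X[TARGET]])  # IN world: target included
        y_in = np.append(y[idx], y[TARGET])
        losses_in.append(target_loss(fit(X_in, y_in, seed), X[TARGET], y[TARGET]))
        losses_out.append(target_loss(fit(X[idx], y[idx], seed), X[TARGET], y[TARGET]))
    return losses_in, losses_out

def model_seeded_eval(n_models=50, n_train=1000):
    """Model-seeded setup: the released dataset is fixed once; only model
    randomness (the training seed) varies across evaluation models."""
    idx = rng.choice(np.arange(1, len(X)), size=n_train - 1, replace=False)
    X_in = np.vstack([X[idx], X[TARGET]])
    y_in = np.append(y[idx], y[TARGET])
    losses_in, losses_out = [], []
    for seed in range(n_models):
        losses_in.append(target_loss(fit(X_in, y_in, seed), X[TARGET], y[TARGET]))
        losses_out.append(target_loss(fit(X[idx], y[idx], seed), X[TARGET], y[TARGET]))
    return losses_in, losses_out

def attack_auc(losses_in, losses_out):
    """AUC of the loss-threshold attack: lower loss => predicted member."""
    labels = [1] * len(losses_in) + [0] * len(losses_out)
    scores = [-l for l in losses_in + losses_out]
    return roc_auc_score(labels, scores)

print("traditional  AUC:", attack_auc(*traditional_eval()))
print("model-seeded AUC:", attack_auc(*model_seeded_eval()))
```

For an outlier record, the IN/OUT loss distributions under the fixed-dataset, model-seeded setup can separate far more cleanly than under dataset resampling, where the record's outlier status itself varies from sample to sample, which is exactly the averaging effect the abstract describes.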
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2905