Keywords: privacy breach detection, two-sample tests, membership inference
Abstract: With the proliferation of machine learning services, the risk of privacy breaches has never been higher, owing to the need to collect valuable yet sensitive training data, sometimes by any means necessary. When unsanctioned data access occurs, it may only become apparent after the fact, in the predictive models trained on the compromised data. This calls for effective membership inference methods that enable an evaluator to identify privacy breaches. Unlike traditional membership inference attacks (MIAs), which determine whether individual data records were used in training, this study evaluates sets of records, particularly sets in which only a small proportion of the records are training members. In this scenario, traditional MIAs often yield unreliable evaluations. To address this issue, we take a privacy evaluator's perspective and propose a novel membership inference approach that applies to sets of records rather than to individual records. It relies on a non-parametric two-sample test that leverages differences between high-level representations to infer membership. In extensive experiments, our proposed High-level Representation-based MMD (HR-MMD) test exhibits high sensitivity in distinguishing between training and non-training sets while maintaining an ideal type I error, making it a powerful membership detection tool. Our study offers insights into an alternative privacy breach detection scenario and opens up a promising avenue for privacy evaluation based on membership inference tests.
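To make the idea of a non-parametric two-sample test on representations concrete, the following is a minimal sketch of a generic kernel MMD permutation test. It is not the HR-MMD procedure itself, whose details the abstract does not give. It assumes the high-level representations have already been extracted as NumPy arrays (one row per record); the Gaussian kernel, the median-heuristic bandwidth, and all function names are illustrative assumptions.

```python
import numpy as np


def _sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding


def median_bandwidth(z):
    """Median heuristic (an assumption here): median pairwise distance in z."""
    d = np.sqrt(_sq_dists(z, z))
    return float(np.median(d[np.triu_indices_from(d, k=1)]))


def mmd2_biased(x, y, bandwidth):
    """Biased estimate of squared MMD with a Gaussian kernel."""
    def k(a, b):
        return np.exp(-_sq_dists(a, b) / (2.0 * bandwidth**2))
    return float(k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean())


def mmd_permutation_test(x, y, n_permutations=200, seed=0):
    """Return (statistic, p_value) for H0: x and y come from the same distribution.

    x: representations of the candidate set under evaluation (hypothetical input).
    y: representations of a reference set known to be non-members (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    bandwidth = median_bandwidth(pooled)
    observed = mmd2_biased(x, y, bandwidth)
    n = len(x)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        exceed += int(stat >= observed)
    # The +1 correction keeps the permutation p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    candidate = rng.normal(loc=1.0, size=(60, 5))  # synthetic stand-in features
    reference = rng.normal(loc=0.0, size=(60, 5))
    print(mmd_permutation_test(candidate, reference))
```

Under this sketch, rejecting the null hypothesis at level alpha (p-value below alpha) would flag the candidate set as distributionally different from the non-member reference set; the type I error is the rate at which such a rejection happens when the two sets actually come from the same distribution.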
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10146