Gradient-Leaks: Enabling Black-Box Membership Inference Attacks Against Machine Learning Models

Published: 01 Jan 2024, Last Modified: 14 Jan 2024. IEEE Trans. Inf. Forensics Secur. 2024
Abstract: Machine Learning (ML) techniques have been applied in many real-world applications to perform a wide range of tasks. In practice, ML models are typically deployed as black-box APIs to protect the model owner's interests and/or to defend against various privacy attacks. In this paper, we present Gradient-Leaks as the first evidence that membership inference attacks (MIAs), which aim to determine whether a data record was used to train a given target ML model, can be carried out with mere black-box access. The key idea of Gradient-Leaks is to construct a local ML model around the given record that approximates the target model's prediction behavior in the record's neighborhood. By extracting the membership information of the record from the gradient of this local substitute model using an intentionally modified autoencoder, Gradient-Leaks can breach the membership privacy of the target model's training data in an unsupervised manner, without any prior knowledge of the target model's internals or its training data. Extensive experiments on different types of ML models with real-world datasets show that Gradient-Leaks achieves better performance than state-of-the-art attacks.
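
The abstract outlines a three-step pipeline (local surrogate fitting from black-box queries, gradient extraction, unsupervised membership scoring via an autoencoder) without implementation details. The following is a minimal sketch of that pipeline under illustrative assumptions: query_target is a hypothetical black-box labeling function, the local surrogate is a logistic regression fit on Gaussian perturbations, and the "modified autoencoder" is approximated by a plain MLP autoencoder whose reconstruction error is used as the membership signal. None of these choices are taken from the paper itself.

```python
# Minimal sketch of a Gradient-Leaks-style pipeline; all concrete choices
# (perturbation scheme, surrogate family, autoencoder architecture) are
# assumptions for illustration, not the authors' implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPRegressor

def local_gradient_feature(x, query_target, n_samples=500, sigma=0.1, seed=None):
    """Fit a local surrogate around record x via black-box queries and return
    the gradient of the surrogate's loss at x as a membership feature."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the record and label the perturbations with the black-box API
    #    (assumes a binary classifier and that both classes appear in the queries).
    X_local = x + sigma * rng.standard_normal((n_samples, x.shape[0]))
    y_local = query_target(X_local)
    # 2. Local substitute model approximating the target's behavior around x.
    surrogate = LogisticRegression(max_iter=1000).fit(X_local, y_local)
    # 3. Gradient of the logistic loss at (x, target's label) w.r.t. the surrogate
    #    weights: (p - y) * x for the weights and (p - y) for the bias.
    p = surrogate.predict_proba(x.reshape(1, -1))[0, 1]
    y = float(query_target(x.reshape(1, -1))[0])
    return np.concatenate([(p - y) * x, [p - y]])

def membership_scores(records, query_target):
    """Unsupervised membership scoring: fit an autoencoder on the gradient
    features and treat low reconstruction error as the 'member' signal."""
    G = np.stack([local_gradient_feature(x, query_target) for x in records])
    ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000).fit(G, G)
    return -np.mean((ae.predict(G) - G) ** 2, axis=1)  # higher => more likely member
```

In this sketch the attack needs only the target's prediction API: thresholding membership_scores (e.g., at its median) would yield member/non-member decisions without any labeled membership data, which mirrors the unsupervised setting described in the abstract.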