Keywords: Privacy attacks, Model extraction, Membership inference, Black-box attack, Query efficiency, Active learning
Abstract: In this paper, we study black-box model stealing attacks where the attacker can only query a machine learning model through publicly available APIs. Specifically, our aim is to design a black-box model stealing attack that uses a minimal number of queries to create an informative replica of the target model. First, we reduce this problem to an online variational optimization problem. At every step, the attacker solves this problem to select the most informative query, which maximizes the entropy of the selected queries while reducing the mismatch between the target and the stolen models. We propose an online and adaptive algorithm, Marich, that leverages active learning to select the queries. We demonstrate the efficiency of our attack against different models, including logistic regression, BERT, and ResNet18, trained on different text and image datasets. Marich steals a model that achieves 70-96$\%$ of the target model's accuracy using 0.8-10$\%$ of the samples from attack datasets, which are publicly available and disjoint from the training datasets. Our stolen models achieve 75-98$\%$ membership inference accuracy and show 70-90$\%$ agreement with membership inference performed directly on the target models. Our experiments validate that Marich is query-efficient and capable of creating an informative replica of the target model.
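The abstract describes selecting queries that maximize entropy while shrinking the target-surrogate mismatch. The following is a minimal illustrative sketch of that general idea (entropy-based active query selection against a black-box target), not the authors' Marich algorithm; the target model, attack pool, batch size, and round count are all hypothetical stand-ins.

```python
# Sketch: entropy-based active query selection for black-box model extraction.
# NOT the Marich algorithm from the paper; an assumed toy setup for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical "target" model, reachable only through label queries.
X_train = rng.normal(size=(500, 5))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
target = LogisticRegression().fit(X_train, y_train)

# Public attack pool, disjoint from the target's training data.
pool = rng.normal(size=(2000, 5))

# Seed the surrogate with a few random queries to the target API.
queried = list(rng.choice(len(pool), size=20, replace=False))
surrogate = LogisticRegression().fit(pool[queried], target.predict(pool[queried]))

for _ in range(10):  # active-learning rounds
    probs = surrogate.predict_proba(pool)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    entropy[queried] = -np.inf          # never re-query a point
    batch = np.argsort(entropy)[-20:]   # most uncertain = most informative
    queried.extend(batch.tolist())
    surrogate.fit(pool[queried], target.predict(pool[queried]))

# Agreement between the stolen and target models on fresh data.
X_test = rng.normal(size=(1000, 5))
agreement = (surrogate.predict(X_test) == target.predict(X_test)).mean()
```

With only ~220 queries out of a 2000-point pool, the surrogate typically agrees with the (linear) target on most held-out points, which mirrors the query-efficiency claim at a toy scale.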
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (eg, AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)
Supplementary Material: zip