Defending against Model Stealing via Verifying Embedded External Features

Published: 21 Jun 2021, Last Modified: 22 Oct 2023
Venue: ICML 2021 Workshop AML Oral
Readers: Everyone
Keywords: Model Stealing, Ownership Verification, Model Privacy, AI Security
Abstract: Well-trained models are valuable intellectual property for their owners. Recent studies have shown that adversaries can 'steal' deployed models even when they have no training samples and can only query the model. Existing defenses alleviate this threat mostly by increasing the cost of model stealing. In this paper, we explore the defense problem from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features. We embed the external features by poisoning a few training samples via style transfer. We then train a meta-classifier, based on the gradient of predictions, to determine whether a suspicious model was stolen from the victim. Our method is inspired by the understanding that stolen models should contain the knowledge of the (external) features learned by the victim model. Experimental results demonstrate that our approach is effective in defending against different model stealing attacks simultaneously.
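The two ingredients named in the abstract, poisoning a few training samples and deriving gradient-based inputs for a meta-classifier, can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not specify the model, the loss, or which gradient is taken, so everything here is an assumption made for illustration: `toy_style` is a hypothetical stand-in for real style transfer, the model is a small linear softmax classifier, and `gradient_features` uses the cross-entropy gradient with respect to the weights. In practice, such vectors computed on transformed samples would be fed to a binary meta-classifier that separates the victim (or a stolen copy) from independently trained models.

```python
import math
import random


def poison_subset(samples, rate, transform, seed=0):
    """Apply `transform` to a random fraction `rate` of samples (labels are untouched)."""
    rng = random.Random(seed)
    n_poison = int(round(rate * len(samples)))
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = [transform(x) if i in chosen else list(x)
                for i, x in enumerate(samples)]
    return poisoned, sorted(chosen)


def toy_style(x):
    # Hypothetical stand-in for style transfer: a fixed, label-independent shift.
    return [v + 0.5 for v in x]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def gradient_features(weights, x, label):
    """Flattened d(cross-entropy)/d(weights) for a linear softmax model (assumed setup)."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    probs = softmax(logits)
    feats = []
    for k, p in enumerate(probs):
        err = p - (1.0 if k == label else 0.0)  # dL/dlogit_k
        feats.extend(err * v for v in x)        # dL/dW[k][j] = err * x[j]
    return feats


# Toy usage: poison 20% of ten 2-D samples, then compute one gradient vector.
data = [[0.1 * i, 0.2 * i] for i in range(10)]
poisoned, idx = poison_subset(data, rate=0.2, transform=toy_style)
weights = [[0.3, -0.1], [-0.2, 0.4]]  # hypothetical 2-class model
features = gradient_features(weights, toy_style(data[1]), label=0)
```

The poisoning step leaves labels unchanged, matching the abstract's description of embedding external features rather than altering the task. How the gradient vectors are post-processed and which models supply the meta-classifier's training pairs are design choices the abstract leaves open.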