Special-purpose Model Extraction Attacks: Stealing Coarse Model with Fewer Queries

Published: 01 Jan 2020, Last Modified: 12 May 2023 · TrustCom 2020
Abstract: Model extraction (ME) attacks have been shown to cause financial losses for Machine-Learning-as-a-Service (MLaaS) providers. Attackers steal ML models hosted on MLaaS platforms by building substitute models from the queries they send to the platform and the responses they receive. The ML models targeted by attackers are called targeted models. Previous studies have assumed that attackers build substitute models that classify the same full set of classes as the targeted models, which may span thousands or millions of classes to meet users' diverse needs. We call such models general-purpose models. In practice, however, attackers can monetize a stolen model even if it only accurately distinguishes some classes of interest from the others. We call such models special-purpose models. For instance, a model that detects vehicles is useful for collision avoidance systems, and a model that detects wild animals is useful for driving them away from agricultural land. In this work, we investigate the threat of special-purpose ME attacks, which steal special-purpose models. Our experimental results show that attackers can build an accurate special-purpose model, achieving an 80% F-measure, with as few as 100 queries in the worst case. We discuss why previously proposed defense methods have difficulty preventing these attacks and point out the need for a new defense method.
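To make the attack setting concrete, the sketch below illustrates the general shape of a special-purpose ME attack under stated assumptions: a label-only black-box query interface, a simulated local "targeted model" standing in for an MLaaS endpoint, and an attacker who relabels responses as "class of interest vs. everything else" before training a binary substitute. The dataset, classifiers, query budget, and names such as query_target and CLASS_OF_INTEREST are illustrative assumptions, not the paper's actual experimental setup.

```python
# Minimal sketch of a special-purpose model extraction attack (illustrative
# only; the paper's datasets, models, and protocol may differ).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for the MLaaS "targeted model": a general-purpose 10-class classifier.
X, y = load_digits(return_X_y=True)
X_target, X_attack, y_target, _ = train_test_split(
    X, y, test_size=0.5, random_state=0)
target_model = RandomForestClassifier(random_state=0).fit(X_target, y_target)

def query_target(samples):
    """Black-box oracle: the attacker only observes predicted labels."""
    return target_model.predict(samples)

# Special-purpose goal: distinguish one class of interest (here, digit 3)
# from all others, rather than replicating the full 10-class behavior.
CLASS_OF_INTEREST = 3
QUERY_BUDGET = 100  # matches the abstract's "as few as 100 queries"

# The attacker spends the budget on unlabeled samples it already possesses,
# then collapses the stolen labels into a binary task.
idx = rng.choice(len(X_attack), size=QUERY_BUDGET, replace=False)
X_queries = X_attack[idx]
stolen_labels = (query_target(X_queries) == CLASS_OF_INTEREST).astype(int)

# Substitute model trained only on the stolen binary labels.
substitute = LogisticRegression(max_iter=1000).fit(X_queries, stolen_labels)

# Evaluate fidelity against the target's behavior on held-out samples
# (evaluation queries are outside the attack budget).
held_out = np.setdiff1d(np.arange(len(X_attack)), idx)
y_true = (query_target(X_attack[held_out]) == CLASS_OF_INTEREST).astype(int)
y_pred = substitute.predict(X_attack[held_out])
print(f"substitute F-measure: {f1_score(y_true, y_pred):.2f}")
```

The design point this sketch captures is why the attack is cheap: the binary special-purpose task needs far fewer labeled queries than reconstructing the targeted model's full output space, which is the asymmetry the abstract argues existing defenses do not account for.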