STMS: An Out-Of-Distribution Model Stealing Method Based on Causality

Published: 01 Jan 2024, Last Modified: 12 Jun 2025. IJCNN 2024. License: CC BY-SA 4.0
Abstract: Machine learning, and deep learning in particular, is widely deployed in real-world applications. However, recent research has shown that model stealing attacks severely infringe on privacy and intellectual property. Consequently, a growing number of researchers are studying the principles and methods of such attacks to promote the secure development of artificial intelligence. Most existing model stealing attacks rely on prior information about the attacked model and assume idealized conditions. To better understand and defend against model stealing in real-world scenarios, we propose a novel model stealing method, named STMS, based on causal inference. For the first time, we introduce the problem of out-of-distribution generalization into the model stealing domain. The proposed approach operates under more challenging conditions: the training and testing data of the target model are unknown, the target model is a black box that returns only hard labels, and the test-time distribution shifts from the training distribution. STMS achieves stealing accuracy and generalization performance comparable to or better than prior works across multiple datasets and tasks. Moreover, the framework is general: it can be applied to improve the effectiveness of other model stealing methods and can be transferred to other areas of machine learning.
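To make the threat model concrete, the following is a minimal sketch of black-box, hard-label model stealing under the conditions the abstract describes; it is not the STMS algorithm itself (the abstract does not detail it). The attacker queries the target model with its own, possibly out-of-distribution, data, keeps only the predicted class for each query, and fits a substitute model on those pairs. All names here (`target_query`, `substitute`, `surrogate_loader`) are hypothetical placeholders.

```python
# Illustrative sketch only; assumes PyTorch and attacker-controlled surrogate data.
import torch
import torch.nn.functional as F


def collect_hard_labels(target_query, surrogate_loader, device="cpu"):
    """Query the black-box target and keep only argmax (hard) labels."""
    inputs, labels = [], []
    for x, _ in surrogate_loader:            # attacker's own (possibly OOD) data
        x = x.to(device)
        with torch.no_grad():
            logits = target_query(x)          # only the final prediction is observable
            y_hard = logits.argmax(dim=1)     # hard labels, no confidence scores
        inputs.append(x.cpu())
        labels.append(y_hard.cpu())
    return torch.cat(inputs), torch.cat(labels)


def train_substitute(substitute, stolen_data, epochs=10, lr=1e-3, device="cpu"):
    """Fit the substitute model to imitate the target's hard-label decisions."""
    x_all, y_all = stolen_data
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(x_all, y_all), batch_size=128, shuffle=True
    )
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    substitute.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = F.cross_entropy(substitute(x), y)  # imitate hard labels
            opt.zero_grad()
            loss.backward()
            opt.step()
    return substitute
```

The key difficulty the abstract highlights is that the substitute trained this way may generalize poorly when the test distribution differs from the attacker's query distribution, which is the out-of-distribution generalization problem STMS addresses with causal inference.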