C2FMI: Coarse-to-Fine Black-Box Model Inversion Attack

Published: 01 Jan 2024, Last Modified: 03 Mar 2025 · IEEE Trans. Dependable Secur. Comput. 2024 · CC BY-SA 4.0
Abstract: Privacy-preserving machine learning requires that models do not reveal any private information about their training data. However, model inversion attacks (MIAs), which aim to recover the features of training data, pose a serious threat to the security of AI models. Most existing MIAs assume that the target model is white-box, but most models deployed in practice are black-box and can only be accessed like an oracle. A few studies address black-box scenarios, but their performance is limited. In this article, we first fully formulate the MIA problem from a Bayesian perspective. Second, we propose a novel two-stage MIA approach, the Coarse-to-Fine Model Inversion Attack (C2FMI), which efficiently addresses the MIA problem in the black-box scenario. In stage I of C2FMI, we design a reverse network that constrains the recovered images (also called attacked images) to fall near the manifold of the training data. In stage II, we design a black-box-oriented strategy that further drives the attacked images toward the training data. Empirically, C2FMI achieves performance that even surpasses existing white-box attack methods. Furthermore, we design a stability analysis method to analyze the stability of C2FMI and existing MIAs. Finally, we explore potential countermeasures that could defend against our attack.
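For illustration only, a minimal PyTorch-style sketch of the two-stage idea described in the abstract; it is not the authors' implementation, and `ReverseNet`, `generator`, `target_model`, and all hyperparameters are assumptions. Stage I maps the target model's prediction vector to a latent code near the training-data manifold; stage II refines that code using only black-box queries (a gradient-free, NES-style update stands in for whatever strategy the paper actually uses).

```python
import torch
import torch.nn as nn


class ReverseNet(nn.Module):
    """Stage I (sketch): map a prediction vector from the target model to a
    GAN latent code, so the recovered image starts near the data manifold."""
    def __init__(self, num_classes=1000, latent_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, 1024), nn.ReLU(),
            nn.Linear(1024, latent_dim),
        )

    def forward(self, pred):
        return self.net(pred)


def stage2_blackbox_refine(latent, generator, target_model, label,
                           steps=100, pop=20, sigma=0.05, lr=0.1):
    """Stage II (sketch): refine the latent code with query-only updates.
    The target model is treated as an oracle; no gradients flow through it."""
    latent = latent.clone()
    for _ in range(steps):
        noise = torch.randn(pop, *latent.shape[1:])   # random perturbations
        scores = []
        for eps in noise:
            img = generator(latent + sigma * eps)     # candidate image
            with torch.no_grad():
                prob = target_model(img).softmax(-1)[0, label]  # oracle query
            scores.append(prob)
        scores = torch.stack(scores)
        # Estimate an ascent direction from the query scores and step toward it.
        grad_est = (scores.view(-1, 1) * noise.view(pop, -1)).mean(0)
        latent = latent + lr * grad_est.view_as(latent) / sigma
    return latent
```

In this sketch, a latent code produced by `ReverseNet` from the target model's output on an auxiliary image would seed `stage2_blackbox_refine`; the design choice it illustrates is simply that stage I gives a coarse, manifold-constrained starting point and stage II fine-tunes it with black-box queries alone.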