Inferring Private Data from AI Models in Metaverse through Black-box Model Inversion Attacks

Published: 01 Jan 2023, Last Modified: 01 Oct 2024 · MetaCom 2023 · CC BY-SA 4.0
Abstract: The widespread application of artificial intelligence technologies in the metaverse introduces significant privacy concerns, making it critical to study the leakage of training information from AI models during metaverse interactions. Model inversion attacks have revealed the privacy vulnerability of deep learning models by reconstructing their training data from predictions. In this paper, we reconstruct the training samples of AI models (target models) in the metaverse under a more practical threat model, where the adversary has only black-box access to the target model and no side information beyond an auxiliary dataset. We propose a contrastive supervised model inversion attack (CSMI). Specifically, we adapt contrastive learning to train a neural network (the projector) that infers the semantic knowledge contained in the target model's outputs. We then design a supervised inversion model with an architecture similar to a conditional GAN, where the projected outputs of the target model serve as conditional inputs that supervise the training process. Finally, to generate inversion samples, we propose a bi-level random search strategy that finds suitable inputs to the trained inversion model via an objective function combining the attack success rate and the quality of the reconstructed images. We conduct extensive experiments to evaluate the performance of the proposed CSMI. The experimental results show that samples reconstructed by CSMI are more visually plausible and reveal more features of the target than those produced by state-of-the-art methods under the black-box setting.
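To make the described pipeline concrete, the following is a minimal PyTorch sketch of the three components named in the abstract: a contrastively trained projector over the target model's black-box outputs, a conditional-GAN-style inversion generator conditioned on the projected outputs, and a search objective combining attack success with an image-quality term. The layer sizes, the InfoNCE-style contrastive loss, and the total-variation quality proxy are illustrative assumptions, not the authors' released implementation.

# Hypothetical sketch of the CSMI components (assumed architectures and losses).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Projector(nn.Module):
    """Maps the target model's black-box output (e.g., a confidence vector)
    to a semantic embedding; trained with a contrastive objective."""
    def __init__(self, out_dim: int, emb_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(out_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, y):
        return F.normalize(self.net(y), dim=-1)


def contrastive_loss(z1, z2, temperature: float = 0.1):
    """InfoNCE-style loss: projected outputs of two queries from the same
    identity are positives; all other pairs in the batch are negatives."""
    logits = z1 @ z2.t() / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)


class ConditionalGenerator(nn.Module):
    """Conditional-GAN-style inversion model: a latent code concatenated with
    the projected target-model output is decoded into an image."""
    def __init__(self, z_dim: int = 100, cond_dim: int = 128, img_size: int = 64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 3 * img_size * img_size), nn.Tanh(),
        )

    def forward(self, z, cond):
        x = self.net(torch.cat([z, cond], dim=-1))
        return x.view(-1, 3, self.img_size, self.img_size)


@torch.no_grad()
def search_objective(target_model, generator, z, cond, target_class,
                     w_quality: float = 0.1):
    """Objective scored during the random search over generator inputs:
    confidence on the target class plus a simple image-quality proxy
    (negative total variation is used here as an illustrative stand-in)."""
    img = generator(z, cond)
    prob = F.softmax(target_model(img), dim=-1)[:, target_class]
    tv = (img[..., 1:, :] - img[..., :-1, :]).abs().mean() + \
         (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    return prob.mean() - w_quality * tv

In such a sketch, the bi-level search would repeatedly sample candidate latent codes z (and, at the outer level, candidate conditions) and keep those that maximize search_objective; the exact search schedule and weighting are details of the paper not reproduced here.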