TrapNet: Model Inversion Defense via Trapdoor

Wanlun Ma, Derui Wang, Yiliao Song, Minhui Xue, Sheng Wen, Zhengdao Li, Yang Xiang

Published: 2025, Last Modified: 06 May 2026. IEEE Trans. Inf. Forensics Secur. 2025. CC BY-SA 4.0

Abstract: Model inversion (MI) attacks, for which effective defense strategies are still lacking, pose significant privacy risks by reconstructing private training data through access to well-trained classifiers. To address this concern, this study introduces TrapNet, a defense against advanced MI attacks that maintains good model utility. TrapNet intentionally injects trapdoors into the classification manifold of the protected target model, which misleads the optimization performed by MI attacks. Specifically, TrapNet leverages a conditional GAN (cGAN) trained on the private dataset to generate diverse and realistic trapdoor samples. In addition, we propose a graph-matching self-obfuscation strategy and an entropy regularization technique to optimize trapdoor injection while preserving model utility. Compared with existing defenses, TrapNet provides universal protection for all target classes without access to any auxiliary public data. Extensive experiments on the CelebA, VGG-Face, and VGG-Face2 datasets demonstrate TrapNet’s superior performance over existing defenses, including the most advanced ones, NetGuard and BiDO, against state-of-the-art model inversion attacks, namely PLG-MI, LOMMA, and Plug&Play.