Model Inversion Robustness: Can Transfer Learning Help?

Published: 17 Jun 2024 · Last Modified: 23 May 2024 · CVPR 2024 · CC BY-NC 4.0
Abstract: Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models. Contemporary MI attacks have achieved impressive attack performance, posing serious threats to privacy. Meanwhile, all existing MI defense methods rely on regularization that is in direct conflict with the training objective, resulting in noticeable degradation in model utility. In this work, we take a different perspective and propose a novel and simple Transfer Learning-based Defense against Model Inversion (TL-DMI) to render MI-robust models. Particularly, by leveraging TL, we limit the number of layers encoding sensitive information from the private training dataset, thereby degrading the performance of MI attacks. We conduct an analysis using Fisher Information to justify our method. Our defense is remarkably simple to implement. Without bells and whistles, we show in extensive experiments that TL-DMI achieves state-of-the-art (SOTA) MI robustness. Our code, pre-trained models, demo and inverted data are available at: https://hosytuyen.github.io/projects/TL-DMI
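The abstract does not specify which layers are fine-tuned, so the following is only a minimal PyTorch sketch of the transfer-learning idea it describes: keep layers pretrained on public data frozen and update only the remaining layers on the private dataset, so fewer layers encode sensitive information. The backbone (ResNet-18), the split point (`layer4`), and the private class count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

num_private_classes = 10  # assumed size of the private task; not from the paper

# Backbone pretrained on a public dataset (ImageNet here, as an assumption).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze every pretrained parameter first ...
for p in model.parameters():
    p.requires_grad = False

# ... then unfreeze only the last residual block, so it alone is
# updated on the private data (the split point is an assumption).
for p in model.layer4.parameters():
    p.requires_grad = True

# Fresh classifier head for the private task (trainable by default).
model.fc = nn.Linear(model.fc.in_features, num_private_classes)

# The optimizer sees only the parameters trained on private data.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-2,
    momentum=0.9,
)
```

Training then proceeds as ordinary fine-tuning on the private dataset; the frozen layers never see private gradients, which is the mechanism the abstract credits for degrading MI attacks.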
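The abstract also mentions a Fisher Information analysis justifying the method. The details are not given here; as a generic illustration of how per-layer sensitivity to the private data can be scored, the sketch below computes the diagonal empirical Fisher (average squared gradient of the negative log-likelihood), summed per parameter tensor. The `batch_size=1` assumption keeps each squared gradient a per-sample quantity; this is a common approximation, not necessarily the paper's exact analysis.

```python
import torch
import torch.nn.functional as F

def layerwise_fisher(model, loader, device="cpu"):
    """Diagonal empirical Fisher information, reduced to one scalar per
    parameter tensor. Assumes `loader` yields single examples
    (batch_size=1) so squared gradients are per-sample quantities."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    count = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        # Negative log-likelihood of the observed label for this sample.
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:  # frozen layers contribute nothing
                fisher[n] += p.grad.detach() ** 2
        count += 1
    # Average over samples, then sum within each tensor: a rough measure
    # of how much information about the data each layer carries.
    return {n: (f / count).sum().item() for n, f in fisher.items()}
```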