Keywords: Model Inversion, Transfer Learning, Model Inversion Defense
TL;DR: This paper proposes a novel transfer learning-based defense that achieves Model Inversion (MI) robustness by restricting which parameters are updated on the private training data, and demonstrates state-of-the-art performance across various MI setups.
Abstract: Model Inversion (MI) attacks aim to reconstruct private training data by abusing access to machine learning models. Contemporary MI attacks have achieved impressive attack performance, posing serious threats to privacy. Meanwhile, all existing MI defense methods rely on regularization that is in direct conflict with the training objective, resulting in noticeable degradation in model utility. **In this work**, we take a different perspective and propose a novel and simple method based on transfer learning (TL) to render MI-robust models. In particular, by leveraging TL, we limit the number of layers encoding sensitive information from the private training dataset, thereby degrading the performance of MI attacks. We conduct an analysis using Fisher Information to justify our method. Our defense is remarkably simple to implement. Without bells and whistles, we show in extensive experiments that our method achieves state-of-the-art (SOTA) MI robustness. **Our code, pre-trained models, demo and inverted data are included in the Appendix.**
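The paper's own implementation is in its appendix; the core idea described above, fine-tuning only the later layers of a publicly pre-trained backbone on the private data so that fewer layers encode sensitive information, can be illustrated with a minimal PyTorch sketch. The ResNet-18 backbone, the choice of `layer4` as the split point, and `num_private_classes` are illustrative assumptions, not the authors' configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

num_private_classes = 10  # assumption: number of classes in the private dataset

# Load a backbone pre-trained on a public dataset (here: ImageNet weights).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze all parameters so the public features are left untouched.
for p in model.parameters():
    p.requires_grad = False

# Unfreeze only the last residual block; together with the new classifier
# head below, these are the only layers updated on private data.
for p in model.layer4.parameters():
    p.requires_grad = True

# Replace the classifier head for the private task (trainable by default).
model.fc = nn.Linear(model.fc.in_features, num_private_classes)

# Optimize only the trainable subset of parameters on the private data.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-2,
    momentum=0.9,
)
```

Under this sketch, an MI attacker inverting the full model can only recover private-data information from the small set of fine-tuned layers, which is the intuition the paper formalizes with its Fisher Information analysis.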
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4680