Abstract: With growing concerns about information security, protecting the privacy of user-sensitive data has become crucial. The rapid development of multi-modal retrieval technologies poses new threats, making sensitive data more vulnerable to leakage and malicious mining. To address this, we introduce a Proactive Adversarial Multi-modal Learning (PAML) approach that transforms sensitive data into adversarial counterparts, evading malicious multi-modal retrieval and thereby ensuring privacy. Our method begins by sending queries to a knowledge-agnostic retrieval system and analyzing the returned results to understand its retrieval feedback mechanism. Using a U-Net-based diffusion model, we construct a semantic perturbation network that subtly alters the implicit semantics of sensitive data. Combined with the multi-modal retrieval results and random noise, this perturbation shifts the data's semantics toward outliers, preventing the data from being retrieved as neighbors of relevant queries. Additionally, a discriminator and a pre-trained model enhance the visual realism and outlier generalization of the protected data. Extensive experiments show that PAML outperforms potential baselines in data privacy protection. Ablation analysis validates the effectiveness of each component, and variants of our approach are applicable to diverse retrieval systems.