RetPur: Diffusion Purification Model for Defending Hash Retrieval Target Attacks

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Adversarial purification, Targeted attacks, Guided diffusion model, Hash retrieval
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Deep Neural Networks (DNNs) have leveraged their strong representational capacity to achieve remarkable performance in image retrieval. However, when malicious actors inject adversarial perturbations into the test data, a retrieval model can readily return results that are either irrelevant or deliberately chosen by the attacker. Targeted attacks are especially harmful because they force predefined results, inflicting greater damage on retrieval performance. While adversarial purification has proven effective against adversarial attacks, its application to retrieval tasks remains unexplored. To address this, we introduce RetPur, a training-free purification model that purifies adversarial test data and thereby mitigates targeted attacks in both uni-modal and cross-modal retrieval systems. RetPur employs a pre-trained diffusion model, offering plug-and-play convenience, and uses the adversarial samples as conditions to guide image generation, which improves task accuracy. On the retrieval side, our study is the first to integrate adversarial purification into uni-modal (Image-to-Image) and cross-modal (Image-to-Image, Image-to-Text) hash retrieval systems, tailored to image retrieval scenarios. We further extend purification to a broader range of attacks, covering both generative and iterative methods. Extensive experiments show that the purified datasets achieve retrieval performance close to that of the original datasets, across different attacks and modalities.
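The abstract describes the purification pipeline only at a high level; the sketch below illustrates the general idea under stated assumptions. It assumes a pretrained DDPM-style noise-prediction network `eps_model` with variance schedule `betas`: the adversarial image is noised by the forward process up to step `t_star`, then denoised by ancestral sampling, with a simple reconstruction-style guidance term pulling each step toward the adversarial input. The names `purify`, `t_star`, and `guidance_scale` are illustrative, and the guidance term stands in for the paper's unspecified conditioning scheme; this is not the authors' implementation.

```python
import torch

@torch.no_grad()
def purify(x_adv, eps_model, betas, t_star=100, guidance_scale=0.3):
    """Diffusion purification sketch: noise the adversarial image with the
    forward process up to step t_star, then denoise it back with DDPM
    ancestral sampling, steering each step toward the adversarial input so
    image content is preserved while the perturbation is washed out.

    x_adv: adversarial images in [-1, 1], shape (B, C, H, W)
    eps_model(x, t): pretrained noise-prediction network eps_theta(x_t, t)
    betas: 1-D tensor holding the model's variance schedule
    """
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)

    # Forward process q(x_t | x_0) in closed form: a single noising step.
    t = t_star - 1
    x = abar[t].sqrt() * x_adv + (1 - abar[t]).sqrt() * torch.randn_like(x_adv)

    # Reverse process: denoise from t_star - 1 down to 0.
    for t in reversed(range(t_star)):
        t_batch = torch.full((x.shape[0],), t, device=x.device, dtype=torch.long)
        eps = eps_model(x, t_batch)

        # Estimate the clean image from the noise prediction (DDPM identity).
        x0_hat = (x - (1 - abar[t]).sqrt() * eps) / abar[t].sqrt()

        # Guidance (illustrative): pull the clean estimate toward the
        # adversarial input; the paper's exact conditioning is not given
        # in the abstract.
        x0_hat = x0_hat + guidance_scale * (x_adv - x0_hat)
        x0_hat = x0_hat.clamp(-1.0, 1.0)

        # Posterior mean of q(x_{t-1} | x_t, x0_hat), then sample.
        abar_prev = abar[t - 1] if t > 0 else torch.ones_like(abar[0])
        mean = (abar_prev.sqrt() * betas[t] / (1 - abar[t])) * x0_hat \
             + (alphas[t].sqrt() * (1 - abar_prev) / (1 - abar[t])) * x
        if t > 0:
            var = (1 - abar_prev) / (1 - abar[t]) * betas[t]
            x = mean + var.sqrt() * torch.randn_like(x)
        else:
            x = mean
    return x
```

Because the diffusion model is used as-is, the purifier can be placed in front of any hash retrieval system without retraining either component, which is the plug-and-play property the abstract emphasizes.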
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6968