Pre-training CLIP against Data Poisoning with Optimal Transport-based Matching and Alignment

ACL ARR 2025 May Submission 376 Authors

12 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Recent studies have shown that Contrastive Language-Image Pre-training (CLIP) models are vulnerable to targeted data poisoning and backdoor attacks because their massive collections of training image-caption pairs are crawled from the Internet. Previous defense methods correct poisoned image-caption pairs by matching a new caption to each image. However, the matching process relies solely on the global representations of images and captions, overlooking fine-grained visual and textual features; it may therefore introduce incorrect image-caption pairs and harm CLIP pre-training. To address these limitations, we propose an Optimal Transport-based framework, named \algo, to reconstruct the image-caption pairs. We introduce a new optimal transport-based distance measure between fine-grained visual and textual feature sets and re-assign captions based on this distance. Additionally, to further reduce the negative impact of mismatched pairs, we encourage inter- and intra-modality fine-grained alignment through optimal transport-based objective functions. Our experiments demonstrate that \algo reduces the attack success rates of poisoning attacks to 0\% in most cases. Compared to previous methods, \algo also significantly improves the zero-shot and linear-probing performance of CLIP models trained on poisoned datasets.
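The abstract does not give implementation details, but the core idea of an optimal transport distance between fine-grained feature sets, used to re-assign captions, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' \algo implementation: it computes an entropic (Sinkhorn) OT distance between image patch embeddings and caption token embeddings with a cosine cost, and re-assigns each image to the caption with the smallest OT distance. The function names and hyperparameters (eps, n_iters) are hypothetical.

```python
# Minimal sketch (assumed, not the paper's code) of an entropic OT distance
# between fine-grained feature sets and OT-based caption re-assignment.
import torch
import torch.nn.functional as F


def sinkhorn_ot_distance(img_feats, txt_feats, eps=0.05, n_iters=50):
    """Entropic OT distance between two L2-normalized feature sets.

    img_feats: (m, d) patch-level image features
    txt_feats: (n, d) token-level caption features
    """
    img_feats = F.normalize(img_feats, dim=-1)
    txt_feats = F.normalize(txt_feats, dim=-1)
    cost = 1.0 - img_feats @ txt_feats.t()              # (m, n) cosine cost
    m, n = cost.shape
    a = torch.full((m,), 1.0 / m, device=cost.device)   # uniform marginals
    b = torch.full((n,), 1.0 / n, device=cost.device)
    K = torch.exp(-cost / eps)                           # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):                             # Sinkhorn iterations
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)           # (m, n) transport plan
    return (plan * cost).sum()                           # OT distance


def reassign_captions(image_sets, caption_sets):
    """Pair each image with the caption whose token set is closest in OT distance."""
    assignments = []
    for img in image_sets:
        dists = torch.stack(
            [sinkhorn_ot_distance(img, cap) for cap in caption_sets]
        )
        assignments.append(int(dists.argmin()))
    return assignments
```

In this reading, a poisoned image-caption pair would be replaced when another candidate caption's token set transports onto the image's patch set at lower cost than the original caption; the same Sinkhorn machinery could also supply the inter- and intra-modality alignment objectives mentioned in the abstract, though the exact loss formulation is not specified here.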
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, pre-training poisoning, CLIP backdoor
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 376