When Noises Help: Improve Text-Image Multimodal Contrastive Learning with Stochastic Label Augmentations

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: Contrastive learning (CL) has been widely used for self-supervised text-image multimodal representation learning. However, the state-of-the-art contrastive learning framework has two setbacks. The first lies in the design of contrastive learning itself, where the model pulls positive pairs together and pushes negative pairs apart: for each image, CL treats only one unique text as its positive sample and all remaining texts as negative samples. This design inevitably introduces a learning bias toward overfitting to specific data pairs. The second setback comes from the web-crawled datasets commonly used in CL, such as Conceptual Captions, YFCC, and LAION. These datasets are beneficial because of their large size, yet they contain a significant number of noisy or vague labels. In this paper, we examine how augmenting the ground-truth labels with randomness can bring significant improvements in text-image multimodal contrastive learning. Through the simple addition of noise to ground-truth labels, we observe substantial improvements in model performance and robustness with no additional computational overhead. We introduce three distinct stochastic label augmentation strategies and evaluate their effectiveness across various benchmarks, including zero-shot transfer, distribution shift, and linear probing tasks. Furthermore, we conduct comprehensive experiments with different model architectures and noise rates, demonstrating the generalizability and substantial benefits of stochastic label augmentation across diverse tasks and models.
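
To make the core idea concrete, here is a minimal sketch of how stochastic label augmentation could be injected into a CLIP-style symmetric contrastive loss. The abstract does not specify the paper's three augmentation strategies, so this illustrates just one plausible variant: perturbing the one-hot ground-truth matrix with uniform random noise controlled by a hypothetical `noise_rate` parameter. All function and variable names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: CLIP-style contrastive loss with stochastically
# softened ground-truth labels. Not the paper's exact method.
import torch
import torch.nn.functional as F

def noisy_targets(batch_size: int, noise_rate: float, device) -> torch.Tensor:
    """Return a one-hot pairing matrix perturbed by uniform random noise,
    renormalized so each row is still a valid probability distribution."""
    targets = torch.eye(batch_size, device=device)           # standard one-hot pairs
    noise = torch.rand_like(targets) * noise_rate            # stochastic augmentation
    soft = targets + noise
    return soft / soft.sum(dim=-1, keepdim=True)

def contrastive_loss_with_label_noise(image_emb, text_emb,
                                      temperature=0.07, noise_rate=0.1):
    """Symmetric image-text contrastive loss with noisy soft targets."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature          # (B, B) similarities

    b = logits.size(0)
    # Sample fresh noisy targets per direction; soft targets are supported
    # by F.cross_entropy in PyTorch >= 1.10.
    loss_i2t = F.cross_entropy(logits, noisy_targets(b, noise_rate, logits.device))
    loss_t2i = F.cross_entropy(logits.t(), noisy_targets(b, noise_rate, logits.device))
    return (loss_i2t + loss_t2i) / 2
```

Relative to plain label smoothing, the noise here is resampled at every step, which is one way the "stochastic" aspect described in the abstract could avoid overfitting to specific image-text pairs without any extra computational overhead.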
Paper Type: short
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Contribution Types: Model analysis & interpretability, Reproduction study, Publicly available software and/or pre-trained models
Languages Studied: English