Semantic-Guided Consistency and Discrimination for Siamese Representation Learning

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Contrastive learning, Siamese representation learning
Abstract: Self-supervised representation learning with Siamese structures (Siamese representation learning) has recently shown promising results. Current methods commonly adopt instance discrimination to learn invariant image-level representations from randomly cropped views, which risks introducing object-irrelevant background nuisances into the learned representations. Follow-up works addressing this problem simply match visual patterns across views independently, without explicitly distinguishing foreground from background regions. Intuitively, background nuisances could be alleviated by separating foreground from background within random crops. We therefore present a new self-supervised learning framework, semantic-guided consistency and discrimination (SCD), which learns to separate foreground and background semantics in random crops while learning image-level representations. Specifically, we extract foreground and background semantics by aggregating the global feature map encoding the image content, using learned feature-level saliency maps (which indicate foreground pixels on the feature maps) as weights. We then construct triplets from the foreground and background semantics of the two augmented views and distinguish foreground from background with a triplet loss. Our SCD strategy can be readily applied to existing Siamese representation learning frameworks, including both contrastive (e.g., MoCo-v2) and non-contrastive (e.g., BYOL) paradigms. Applied to both paradigms, SCD achieves consistent improvements on classification and dense prediction tasks.
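The two core operations the abstract describes, saliency-weighted aggregation of a feature map into foreground/background vectors, and a margin-based triplet loss over the resulting semantics of two views, can be sketched in plain NumPy. This is a minimal illustration under assumptions: the function names, the normalization of the saliency weights, and the margin value are hypothetical, as the abstract does not give the exact formulation.

```python
import numpy as np

def saliency_pool(feat, saliency):
    """Aggregate a (C, H, W) feature map into foreground and background
    vectors, weighting each spatial location by a (H, W) saliency map in
    [0, 1]. Normalization choice is an assumption, not from the paper."""
    w_fg = saliency / (saliency.sum() + 1e-8)
    w_bg = (1.0 - saliency) / ((1.0 - saliency).sum() + 1e-8)
    # Broadcast weights over channels, then sum over spatial positions.
    fg = (feat * w_fg).reshape(feat.shape[0], -1).sum(axis=1)
    bg = (feat * w_bg).reshape(feat.shape[0], -1).sum(axis=1)
    return fg, bg

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard margin-based triplet loss on L2 distances; the margin
    value here is an arbitrary placeholder."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy usage: foreground of view 1 as anchor, foreground of view 2 as
# positive, background of view 1 as negative (one plausible triplet).
rng = np.random.default_rng(0)
feat1, feat2 = rng.normal(size=(2, 8, 4, 4))
sal1, sal2 = rng.uniform(size=(2, 4, 4))
fg1, bg1 = saliency_pool(feat1, sal1)
fg2, _ = saliency_pool(feat2, sal2)
loss = triplet_loss(fg1, fg2, bg1)
```

In a real training pipeline the feature maps would come from the Siamese encoder and the saliency maps would themselves be learned, with this loss added alongside the image-level contrastive or non-contrastive objective.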
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1976