Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views

Junbo Zhang, Kaisheng Ma

CVPR 2022 (modified: 29 Jan 2023)

Abstract: A data augmentation module is utilized in contrastive learning to transform the given data example into two views, which is considered essential and irreplaceable. However, the pre-determined composition of multiple data augmentations brings two drawbacks. First, the artificial choice of augmentation types brings specific representational invariances to the model, which have different degrees of positive and negative effects on different downstream tasks. Treating each type of augmentation equally during training makes the model learn non-optimal representations for various downstream tasks and limits the flexibility to choose augmentation types beforehand. Second, the strong data augmentations used in classic contrastive learning methods may bring too much invariance in some cases, and fine-grained information that is essential to some downstream tasks may be lost. This paper proposes a general method to alleviate these two problems by considering “where” and “what” to contrast in a general contrastive learning framework. We first propose to learn different augmentation invariances at different depths of the model according to the importance of each data augmentation instead of learning representational invariances evenly in the backbone. We then propose to expand the contrast content with augmentation embeddings to reduce the misleading effects of strong data augmentations. Experiments based on several baseline methods demonstrate that we learn better representations for various benchmarks on classification, detection, and segmentation downstream tasks.
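
The two ideas in the abstract ("where" and "what" to contrast) can be illustrated with a small sketch. This is not the authors' implementation: it assumes an InfoNCE-style loss, assumes that deeper stages are made invariant to a growing prefix of an importance-ordered augmentation list, and assumes that an "expanded view" is a feature vector concatenated with an embedding of its augmentation parameters. All function names (`varied_augmentations`, `expand_view`, `info_nce`) are hypothetical.

```python
import math


def varied_augmentations(ordered_augs, stage):
    """Assumption: at backbone stage `stage` (0-based), the two views differ
    only in the first `stage + 1` augmentations of an importance-ordered list,
    so deeper stages become invariant to more augmentation types."""
    return list(ordered_augs[: stage + 1])


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def expand_view(feature, aug_embedding):
    """Assumption: an expanded view is the feature concatenated with an
    embedding of the augmentation parameters, then L2-normalized."""
    return l2_normalize(list(feature) + list(aug_embedding))


def info_nce(queries, keys, temperature=0.2):
    """Mean InfoNCE loss where queries[i] and keys[i] form the positive pair
    and every other key in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)


# Usage: two samples, each with a 2-d feature and a 1-d augmentation embedding.
augs = ["crop", "color_jitter", "blur"]      # illustrative ordering only
stage_augs = varied_augmentations(augs, 1)   # augmentations varied at stage 1
view_a = [expand_view([1.0, 0.0], [0.5]), expand_view([0.0, 1.0], [0.2])]
view_b = [expand_view([0.9, 0.1], [0.5]), expand_view([0.1, 0.9], [0.2])]
loss = info_nce(view_a, view_b)
```

In a full model, one such loss would be computed per backbone stage and the stage losses summed; how augmentations are ordered and how their parameters are embedded are design choices the abstract does not specify.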
