Multi-Level Contrastive Learning for Dense Prediction Task

22 Sept 2022 (modified: 12 Mar 2024) · ICLR 2023 Conference Withdrawn Submission
Keywords: Self-supervised learning, Detection, Segmentation
Abstract: In this work, we present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method to learn region-level feature representations for dense prediction tasks. The approach is motivated by three key factors in detection: localization, scale consistency, and recognition. To address these factors, we design a novel pretext task that explicitly encodes absolute position and scale information simultaneously by assembling multi-scale images in a montage manner to mimic a multi-object scenario. Unlike existing image-level self-supervised methods, our method constructs a multi-level contrastive loss that treats each sub-region of the montage image as a singleton, learning regional semantic representations with translation and scale consistency while reducing the number of pre-training epochs to that of supervised pre-training. Extensive experiments show that MCL consistently outperforms recent state-of-the-art methods on various datasets by significant margins. In particular, MCL obtains 42.5 AP^bb and 38.3 AP^mk on COCO with the 1x schedule and surpasses MoCo by 4.0 AP^bb and 3.1 AP^mk, when using Mask R-CNN with an R50-FPN backbone pre-trained for 100 epochs. In addition, we further explore the alignment between the pretext task and downstream tasks. We extend our pretext task to supervised pre-training, which achieves performance similar to its self-supervised counterpart, demonstrating the importance of aligning the pretext task with downstream tasks.
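To make the montage pretext task and the region-level contrastive objective concrete, below is a minimal PyTorch sketch. The helper names (`build_montage`, `region_features`, `region_infonce`), the 2x2 layout, and the temperature value are illustrative assumptions, not the authors' exact recipe.

```python
# Hypothetical sketch of a montage pretext task with a region-level InfoNCE loss.
# Assumes a 2x2 montage layout; MCL's actual assembly and loss may differ.
import torch
import torch.nn.functional as F

def build_montage(imgs, out_size=224):
    """Assemble 4 images (each C x H x W) into a 2x2 montage: each image is
    resized to one quadrant, so the montage mimics a multi-object scene with
    a known position and scale for every sub-region."""
    half = out_size // 2
    tiles = [F.interpolate(im.unsqueeze(0), size=(half, half),
                           mode="bilinear", align_corners=False)
             for im in imgs]                      # each: 1 x C x half x half
    top = torch.cat(tiles[:2], dim=3)             # 1 x C x half x out_size
    bottom = torch.cat(tiles[2:], dim=3)          # 1 x C x half x out_size
    return torch.cat([top, bottom], dim=2).squeeze(0)  # C x out_size x out_size

def region_features(feat_map):
    """Pool one embedding per montage quadrant from a backbone feature map
    (C x H x W), since each quadrant corresponds to one source image."""
    c, h, w = feat_map.shape
    quads = [feat_map[:, :h // 2, :w // 2], feat_map[:, :h // 2, w // 2:],
             feat_map[:, h // 2:, :w // 2], feat_map[:, h // 2:, w // 2:]]
    return torch.stack([q.mean(dim=(1, 2)) for q in quads])  # 4 x C

def region_infonce(q, k, temperature=0.2):
    """Region-level InfoNCE: the same quadrant in the two montage views forms
    the positive pair; all other quadrants serve as negatives."""
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    logits = q @ k.t() / temperature              # 4 x 4 similarity matrix
    labels = torch.arange(q.size(0))              # positives on the diagonal
    return F.cross_entropy(logits, labels)
```

In this sketch, two augmented montage views would be passed through an encoder, `region_features` would extract one vector per sub-region from each view, and `region_infonce` would pull matching sub-regions together while pushing the others apart; applying the loss at several feature-pyramid levels would give the "multi-level" supervision described in the abstract.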
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning
TL;DR: Multi-Level Contrastive Learning is an efficient self-supervised method to learn region-level feature representation for dense prediction tasks.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2304.02010/code)