Multi-Level Contrastive Learning for Dense Prediction Task

22 Sept 2022 (modified: 12 Mar 2024) · ICLR 2023 Conference Withdrawn Submission
Keywords: Self-supervised learning, Detection, Segmentation
Abstract: In this work, we present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method to learn region-level feature representations for dense prediction tasks. The approach is motivated by three key factors in detection: localization, scale consistency, and recognition. To address these factors, we design a novel pretext task that explicitly encodes absolute position and scale information simultaneously by assembling multi-scale images in a montage manner to mimic a multi-object scenario. Unlike existing image-level self-supervised methods, our method constructs a multi-level contrastive loss that treats each sub-region of the montage image as a singleton, learning regional semantic representations with translation and scale consistency while reducing the number of pre-training epochs to that of supervised pre-training. Extensive experiments show that MCL consistently outperforms recent state-of-the-art methods on various datasets by significant margins. In particular, MCL obtains 42.5 AP^bb and 38.3 AP^mk on COCO with the 1x schedule and surpasses MoCo by 4.0 AP^bb and 3.1 AP^mk, when using Mask R-CNN with an R50-FPN backbone pre-trained for 100 epochs. In addition, we further explore the alignment between the pretext task and downstream tasks. We extend our pretext task to supervised pre-training, which achieves performance similar to its self-supervised counterpart, demonstrating the importance of aligning the pretext task with downstream tasks.
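To make the montage pretext task and the region-level contrastive objective concrete, below is a minimal PyTorch sketch. The helper names (`build_montage`, `region_features`, `region_infonce`), the 2x2 layout, and the temperature value are illustrative assumptions, not the authors' exact recipe.

```python
# Hypothetical sketch of a montage pretext task with a region-level InfoNCE loss.
# Assumes a 2x2 montage layout; MCL's actual assembly and loss may differ.
import torch
import torch.nn.functional as F

def build_montage(imgs, out_size=224):
    """Assemble 4 images (each C x H x W) into a 2x2 montage: each image is
    resized to one quadrant, so the montage mimics a multi-object scene with
    a known position and scale for every sub-region."""
    half = out_size // 2
    tiles = [F.interpolate(im.unsqueeze(0), size=(half, half),
                           mode="bilinear", align_corners=False)
             for im in imgs]                      # each: 1 x C x half x half
    top = torch.cat(tiles[:2], dim=3)             # 1 x C x half x out_size
    bottom = torch.cat(tiles[2:], dim=3)          # 1 x C x half x out_size
    return torch.cat([top, bottom], dim=2).squeeze(0)  # C x out_size x out_size

def region_features(feat_map):
    """Pool one embedding per montage quadrant from a backbone feature map
    (C x H x W), since each quadrant corresponds to one source image."""
    c, h, w = feat_map.shape
    quads = [feat_map[:, :h // 2, :w // 2], feat_map[:, :h // 2, w // 2:],
             feat_map[:, h // 2:, :w // 2], feat_map[:, h // 2:, w // 2:]]
    return torch.stack([q.mean(dim=(1, 2)) for q in quads])  # 4 x C

def region_infonce(q, k, temperature=0.2):
    """Region-level InfoNCE: the same quadrant in the two montage views forms
    the positive pair; all other quadrants serve as negatives."""
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    logits = q @ k.t() / temperature              # 4 x 4 similarity matrix
    labels = torch.arange(q.size(0))              # positives on the diagonal
    return F.cross_entropy(logits, labels)
```

In this sketch, two augmented montage views would be passed through an encoder, `region_features` would extract one vector per sub-region from each view, and `region_infonce` would pull matching sub-regions together while pushing the others apart; applying the loss at several feature-pyramid levels would give the "multi-level" supervision described in the abstract.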
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning
TL;DR: Multi-Level Contrastive Learning is an efficient self-supervised method to learn region-level feature representation for dense prediction tasks.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2304.02010/code)