Keywords: MixMask, Masked Siamese Networks, Self-supervised Learning
Abstract: Recent advances in self-supervised learning integrate masked modeling and Siamese networks into a single framework to fully reap the advantages of both techniques. However, these approaches simply inherit the default loss design from previous Siamese networks and ignore the change in distance introduced by the masking operation. In this paper, we propose a filling-based masking strategy called MixMask to prevent the information loss that the vanilla masking method incurs through randomly erased regions of an image. We further introduce a dynamic loss function with a soft distance to suit the integrated architecture and avoid mismatches between the transformed input and the objective in Masked Siamese ConvNets (MSCN). The dynamic loss distance is calculated according to the mix-masking scheme. We conduct extensive experiments on CIFAR-100, Tiny-ImageNet, and ImageNet-1K. The results demonstrate that the proposed framework achieves better accuracy in linear evaluation and semi-supervised learning, outperforming the state-of-the-art MSCN by a significant margin. We also show its superiority on the downstream tasks of object detection and segmentation. Our source code will be publicly available.
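The filling-based masking idea described in the abstract (replacing the erased regions of one image with content from another, and deriving a soft loss weight from the mixing proportion) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `mixmask`, the patch-grid parameterization, and the use of the kept-area fraction as the soft distance weight are all assumptions for illustration.

```python
import numpy as np

def mixmask(img_a, img_b, grid=4, mask_ratio=0.5, rng=None):
    """Hypothetical sketch of filling-based mix-masking: patches of img_a
    selected by a random binary grid mask are filled with the corresponding
    patches of img_b instead of being erased to zero."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[:2]
    ph, pw = h // grid, w // grid
    # Binary patch-level mask: True means "take this patch from img_b".
    mask = rng.random((grid, grid)) < mask_ratio
    # Upsample the patch mask to pixel resolution.
    mask_full = np.kron(mask, np.ones((ph, pw))).astype(bool)
    if img_a.ndim == 3:
        mask_full = mask_full[..., None]  # broadcast over channels
    mixed = np.where(mask_full, img_b, img_a)
    # Fraction of img_a that survives; a plausible soft weight for the
    # dynamic loss distance (assumption, not the paper's exact formula).
    lam = 1.0 - mask.mean()
    return mixed, lam
```

In a Siamese setup, `lam` could then weight the similarity targets between the mixed view and the two clean views, so the objective matches the actual content of the transformed input rather than assuming an unmasked image.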
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning
TL;DR: A New Masking Strategy for Masked Siamese Self-supervised Learning
Community Implementations: [ 1 code implementation](https://www.catalyzex.com/paper/mixmask-revisiting-masked-siamese-self/code)