Abstract: Self-supervised learning aims to learn broadly applicable pre-trained models from massive unlabeled data. Beyond image-level pretext tasks, many recent pixel-level methods have been proposed to learn dense information within each image. However, most of these methods focus on obtaining pairs of matched patches from the same image under different augmentations, whereas little effort has been devoted to exploiting matched patches from different images. In this work, we develop a novel pixel-level task that leverages an ensemble of nearest neighbors from multiple images to explore the diverse objects in each image, particularly for remote sensing data. In addition, a sampling strategy based on a submodular function is adopted to efficiently update the memory bank of patches. Extensive experiments on remote sensing data confirm the effectiveness of our method.
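The two mechanisms named in the abstract, nearest-neighbor retrieval from a patch memory bank and submodular selection of which patches to keep, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, how the neighbor ensemble is formed, or which submodular function is used, so everything below is an assumption made for illustration: cosine similarity, a simple mean over the k neighbors, and a greedy facility-location objective. The function names (`knn_ensemble`, `greedy_facility_location`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def knn_ensemble(queries, bank, k):
    """For each query patch, find its k most similar bank patches.

    Returns the neighbor indices, shape (Q, k), and the mean neighbor
    embedding, shape (Q, D). Averaging is an assumed way to form the ensemble.
    """
    sims = l2_normalize(queries) @ l2_normalize(bank).T  # (Q, N) cosine similarities
    idx = np.argsort(-sims, axis=1)[:, :k]               # top-k per query
    return idx, bank[idx].mean(axis=1)


def greedy_facility_location(candidates, budget):
    """Greedily select up to `budget` rows that best cover all candidates.

    Objective (assumed): f(S) = sum_i max_{j in S} sim(i, j), a standard
    monotone submodular function when similarities are non-negative.
    """
    z = l2_normalize(candidates)
    sims = np.clip(z @ z.T, 0.0, None)   # clip so the objective stays monotone
    coverage = np.zeros(len(z))          # best similarity to the selected set so far
    selected = []
    for _ in range(min(budget, len(z))):
        # Marginal gain of adding each candidate j to the current selection.
        gains = np.maximum(sims, coverage[:, None]).sum(axis=0) - coverage.sum()
        if selected:
            gains[selected] = -np.inf    # never pick the same patch twice
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sims[:, best])
    return selected
```

Under these assumptions, a memory bank update would pool the current bank with newly extracted patch embeddings, call `greedy_facility_location` on the pool with the bank size as `budget`, and keep only the selected rows. The greedy rule favors patches that cover regions of the embedding space not yet represented, which is one plausible reading of "exploring diverse objects".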