A Semantic Segmentation Algorithm Based on Contrastive Learning Using Aligned Feature Samples

Xibei Jia, Zhichao Lian

Published: 01 Jan 2022, Last Modified: 13 Nov 2023, Venue: DASC/PiCom/CBDCom/CyberSciTech 2022, Readers: Everyone

Abstract: With increasingly robust pixel-wise classifiers, semantic segmentation has made much progress. However, these methods classify each pixel based only on its own feature vector and do not consider the relationships between pixels. The revival of contrastive learning in the unsupervised domain has caught our attention. It learns shared features among instances of the same class, distinguishes instances of different classes, and improves the network's classification by optimizing the distribution of image-level samples in the embedding space. We believe that contrastive learning can also optimize the feature space of pixel-level samples and enable more accurate segmentation predictions, provided that the migration from the image level to the pixel level is handled carefully. Unlike existing approaches that enforce semantic labels on individual pixels and match labels between neighbouring pixels, we propose generating pixel-level samples and matching the semantic relations between adjacent pixels in the embedding space. Because a transformer captures long-range context, the pixel samples it produces are more comparable than those produced by a CNN. We therefore add a pixel-wise contrastive loss between pixel samples extracted by the transformer encoder, which shortens the distance between features of the same class and pushes the clusters of different classes apart. To ensure that the comparison samples are consistent with their labels, we also add components that optimize this module. The final experiments show that optimizing the sample space can improve semantic segmentation results.
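The idea of a pixel-wise contrastive loss can be illustrated with a minimal sketch. This is a generic supervised InfoNCE-style formulation, not necessarily the exact loss used in the paper; the function names, the temperature value, and the toy embeddings are all assumptions made for illustration. Each sampled pixel embedding is pulled toward embeddings with the same label and pushed away from the rest.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised InfoNCE-style loss over sampled pixel embeddings.

    embeddings: list of feature vectors, one per sampled pixel (hypothetical input)
    labels:     list of class ids, one per sampled pixel
    temperature: softmax temperature (0.1 is an assumed default)
    """
    feats = [l2_normalize(e) for e in embeddings]
    n = len(feats)
    total, count = 0.0, 0
    for i in range(n):
        # Scaled cosine similarity between anchor i and every other pixel.
        sims = {
            j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing
        # Numerically stable log-sum-exp over all other pixels.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Mean negative log-probability of picking a same-class pixel.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0


# Toy example: four 2-D pixel embeddings forming two tight groups.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = pixel_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = pixel_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is small when labels agree with the clusters in the embedding space and large when they do not, which is the behaviour the abstract describes: same-class features move closer together and different classes are dispersed.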
