Abstract: Sentence representation learning transforms sentences into fixed-length vectors, providing the foundation for downstream tasks such as information retrieval and semantic similarity analysis. The rise of contrastive learning has further advanced sentence representation learning. Meanwhile, momentum-based contrastive learning has achieved great success in computer vision, as it decouples the number of negative samples from the batch size. However, it falls short of expectations on natural language processing tasks for two reasons: its data augmentation strategies are weak for text, and it uses only the samples in the momentum queue as negatives while ignoring those generated in the current batch. In this paper, we propose eMoCo (enhanced Momentum Contrast) to address these issues. We formulate a set of data augmentation strategies for text and present a novel Dual-Negative loss to make full use of all negative samples. Extensive experiments on STS (Semantic Textual Similarity) datasets show that our method outperforms current state-of-the-art models, indicating its advantages in sentence representation learning.
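The abstract only names the Dual-Negative loss; its exact form appears in the paper body. As a rough sketch of the idea, assuming an InfoNCE-style objective in which each query scores its positive key against both momentum-queue negatives and in-batch negatives, the PyTorch snippet below is illustrative only (the function name, temperature value, and masking scheme are assumptions for illustration, not the authors' implementation):

```python
import torch
import torch.nn.functional as F

def dual_negative_loss(q, k, queue, temperature=0.05):
    """Illustrative InfoNCE-style loss combining two negative sources.

    q, k:   (B, d) query/key embeddings of the same sentences under
            different augmentations (k from the momentum encoder).
    queue:  (K, d) cached key embeddings from previous batches.
    """
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    queue = F.normalize(queue, dim=1)

    # Positive logits: each query vs. its own key, shape (B, 1).
    l_pos = torch.sum(q * k, dim=1, keepdim=True)

    # Queue negatives: each query vs. every cached key, shape (B, K).
    l_queue = q @ queue.t()

    # In-batch negatives: each query vs. the other keys in the batch,
    # shape (B, B), with the diagonal (the positives) masked out.
    l_batch = q @ k.t()
    mask = torch.eye(q.size(0), dtype=torch.bool, device=q.device)
    l_batch = l_batch.masked_fill(mask, float('-inf'))

    # The positive sits at column 0, so all labels are zero.
    logits = torch.cat([l_pos, l_queue, l_batch], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)
```

Placing the positive logit at index 0 lets a standard cross-entropy with all-zero labels serve as the contrastive objective, so enlarging the queue adds negatives without increasing the batch size, while the in-batch term recovers the negatives that plain MoCo discards.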