Harnessing Dimensional Contrast and Information Compensation for Sentence Embedding Enhancement

Published: 01 Jan 2025, Last Modified: 04 Nov 2025 · ICASSP 2025 · CC BY-SA 4.0
Abstract: Unsupervised sentence embedding learning has advanced largely through positive sample construction and instance-level contrastive learning (ICL). However, noisy data augmentation and an unconstrained ICL objective can cause over-compression of representations and contamination across embedding dimensions. To mitigate these issues, we propose a novel enhancement method, MSSE, which incorporates an Information Compensation Mechanism (ICM) and a Dimensional-Level Contrastive Learning Mechanism (DCM). ICM, inspired by the information bottleneck principle, prevents excessive compression during representation learning; DCM constrains ICL and minimizes information leakage across dimensions. Experimental results show that MSSE surpasses competitive baselines on seven STS tasks in both unsupervised and few-shot settings. The source code is available at https://github.com/Hekang001/MSSE.
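To make the idea of dimensional-level contrastive learning concrete, the sketch below shows one plausible formulation: each embedding dimension (rather than each sentence instance) is treated as the contrastive unit, so the same dimension across two views forms a positive pair and all other dimensions act as negatives, discouraging information leakage between dimensions. This is an illustrative assumption, not the MSSE implementation; the function name, the two-view setup, and the temperature value are all hypothetical, and the actual objective is defined in the paper and the linked repository.

```python
import torch
import torch.nn.functional as F


def dimension_level_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                                     temperature: float = 0.05) -> torch.Tensor:
    """Illustrative dimension-level contrastive objective (not the official MSSE loss).

    z1, z2: (batch, dim) embeddings of the same sentences under two views
    (e.g., two dropout masks). Each dimension (column) is the contrastive unit:
    the same dimension across views is a positive pair, every other dimension
    is a negative.
    """
    # Center each dimension over the batch, then L2-normalize the columns.
    d1 = F.normalize(z1 - z1.mean(dim=0, keepdim=True), dim=0)
    d2 = F.normalize(z2 - z2.mean(dim=0, keepdim=True), dim=0)

    # (dim, dim) similarity matrix between dimensions of the two views.
    sim = d1.t() @ d2 / temperature

    # InfoNCE over dimensions: diagonal entries are the positive pairs.
    labels = torch.arange(sim.size(0), device=sim.device)
    return 0.5 * (F.cross_entropy(sim, labels) + F.cross_entropy(sim.t(), labels))


if __name__ == "__main__":
    # Toy usage with random embeddings standing in for two encoder views.
    z1, z2 = torch.randn(32, 768), torch.randn(32, 768)
    print(dimension_level_contrastive_loss(z1, z2).item())
```

In this reading, the dimension-level term complements the usual instance-level InfoNCE loss rather than replacing it; how the two terms and the information-compensation objective are weighted is specified by the paper, not by this sketch.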