Abstract: Recently, SimCSE, a simple contrastive learning framework for sentence embeddings, has demonstrated the feasibility of contrastive learning for training sentence embeddings and its capacity to produce an aligned and uniform embedding space.
However, prior studies have shown that dense models may contain harmful parameters that degrade model performance.
This prompts us to ask whether SimCSE also contains such harmful parameters.
To tackle this problem, we apply parameter sparsification, using alignment and uniformity scores to measure each parameter's contribution to the overall quality of the sentence embeddings.
Drawing on a preliminary study, we hypothesize that parameters with minimal contributions are detrimental and that sparsifying them improves model performance. Accordingly, we propose a sparsified SimCSE (SparseCSE).
To systematically explore the ubiquity of detrimental parameters and the effect of removing them, we conduct extensive experiments on standard semantic textual similarity (STS) tasks and transfer learning tasks. The results show that the proposed SparseCSE significantly outperforms SimCSE.
Furthermore, through an in-depth analysis, we establish the validity and stability of our sparsification method, showing that the embedding space generated by SparseCSE exhibits improved alignment compared to that produced by SimCSE. Importantly, uniformity remains uncompromised.
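The abstract does not give the exact scoring used to measure per-parameter contributions; as a point of reference, below is a minimal sketch of the standard alignment and uniformity metrics of Wang and Isola (2020), which SimCSE-style work typically reports. The function names `alignment` and `uniformity` and the sample tensors are illustrative, not taken from the paper.

```python
# Sketch of the standard alignment and uniformity metrics (Wang & Isola, 2020).
# The per-parameter contribution scoring used in SparseCSE is not specified in
# the abstract; this only illustrates the two quality measures it references.
import torch
import torch.nn.functional as F

def alignment(x: torch.Tensor, y: torch.Tensor, alpha: float = 2) -> torch.Tensor:
    """Mean distance between normalized embeddings of positive pairs (lower is better)."""
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x: torch.Tensor, t: float = 2) -> torch.Tensor:
    """Log of the average pairwise Gaussian potential over normalized embeddings (lower is better)."""
    x = F.normalize(x, dim=-1)
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

# Example: embeddings of positive sentence pairs (e.g., two dropout views in SimCSE).
emb_a = torch.randn(128, 768)
emb_b = torch.randn(128, 768)
print(alignment(emb_a, emb_b).item(), uniformity(emb_a).item())
```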
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: sentence embedding, structured model pruning, lottery ticket hypothesis
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 5240