Debiased Contrastive Learning of Unsupervised Sentence Representations

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission · Readers: Everyone
Abstract: Recently, contrastive learning has shown effectiveness in fine-tuning pre-trained language models (PLMs) to derive sentence representations: it pulls augmented positive examples together to improve alignment, while pushing apart irrelevant negatives to improve the uniformity of the whole representation space. However, previous works mostly sample negatives from the batch or the training data at random. This may introduce a sampling bias in which improper negatives (e.g., false negatives and anisotropic representations) are used to learn sentence representations, hurting the uniformity of the representation space. To address this problem, we present a new framework, DCLR, to alleviate the influence of this sampling bias. In DCLR, we design an instance weighting method to penalize false negatives and generate noise-based negatives to guarantee the uniformity of the representation space. Experiments on seven semantic textual similarity tasks show that our approach is more effective than competitive baselines. Our code and data will be released to reproduce all the experiments.
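Below is a minimal sketch of how a debiased contrastive objective of this kind could look, written in PyTorch for illustration only; it is not the authors' released implementation. It assumes a simple thresholded weighting rule (weights zero out in-batch negatives whose cosine similarity to the anchor exceeds a threshold `phi`, treating them as likely false negatives) and uses random unit vectors as noise-based negatives. The names, the threshold, and the noise-sampling scheme are assumptions, not values or procedures taken from the paper.

```python
# Hypothetical sketch of a debiased contrastive loss (not the paper's code):
# weighted in-batch negatives plus noise-based negatives in an InfoNCE-style
# objective over two views (z1, z2) of the same batch of sentences.
import torch
import torch.nn.functional as F


def debiased_contrastive_loss(z1, z2, tau=0.05, phi=0.9, num_noise=16):
    """z1, z2: (batch, dim) sentence representations of two augmented views."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    batch, dim = z1.shape

    # Cosine similarities between each anchor in z1 and all candidates in z2.
    sim = z1 @ z2.t() / tau                      # (batch, batch)

    # Instance weights: zero out in-batch negatives whose raw similarity to
    # the anchor is above the threshold -- treated here as false negatives.
    with torch.no_grad():
        weights = (z1 @ z2.t() < phi).float()    # (batch, batch)
        weights.fill_diagonal_(1.0)              # always keep the positive pair

    # Noise-based negatives: random unit vectors in the representation space.
    noise = F.normalize(torch.randn(num_noise, dim, device=z1.device), dim=-1)
    sim_noise = z1 @ noise.t() / tau             # (batch, num_noise)

    # Weighted InfoNCE: positive on the diagonal; weighted in-batch negatives
    # and noise negatives form the denominator.
    pos = torch.diag(sim)
    denom = (torch.exp(sim) * weights).sum(dim=-1) + torch.exp(sim_noise).sum(dim=-1)
    return -(pos - torch.log(denom)).mean()
```

In this sketch the weighting and the noise negatives are kept deliberately simple; the paper's actual instance weighting method and noise-negative construction may differ, so this should be read only as an illustration of the general debiasing idea described in the abstract.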