Self-Augmentation via Self-Reweighting: Unlocking Intrinsic Potential of Language Models for Cross-Encoded Conditional Semantic Textual Similarity Measurement
Abstract: Conditional Semantic Textual Similarity (C-STS) augments the traditional Semantic Textual Similarity (STS) task with specific limiting conditions, posing challenges for many mainstream models. Language models that employ cross-encoding perform well on STS, yet their effectiveness diminishes significantly on C-STS. In this work, we argue that cross-encoding language models fail on C-STS not because they cannot extract effective features, but because they extract too many, diluting the impact of condition-relevant features. To alleviate this, we propose Self-Augmentation via Self-Reweighting, which requires no external auxiliary information; instead, it amplifies condition-relevant features and suppresses condition-irrelevant ones using the model's intrinsic information. The self-reweighted outputs then serve as a self-augmentation signal that enhances the model's original outputs. On the C-STS test set, the proposed method consistently improves all fine-tuned baseline models (by up to around 3 points). Remarkably, it even enables models with far fewer parameters to surpass zero-shot and few-shot prompted large language models, such as GPT-4, despite their substantially larger parameter scales.
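To make the abstract's mechanism concrete, here is a minimal sketch in PyTorch, assuming a standard cross-encoder setup in which the two sentences and the condition are concatenated into one input. It is not the authors' implementation: the class name, the `condition_mask` input, and the mixing weight `alpha` are illustrative assumptions. The sketch reads condition relevance off the model's own last-layer attentions (its "intrinsic information"), reweights token features by that relevance, and mixes the reweighted score back into the original one as a self-augmentation signal.

```python
import torch
import torch.nn as nn
from transformers import AutoModel


class SelfReweightingCrossEncoder(nn.Module):
    """Hypothetical sketch of self-reweighting + self-augmentation for C-STS."""

    def __init__(self, model_name: str = "roberta-base", alpha: float = 0.5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.scorer = nn.Linear(hidden, 1)   # shared regression head
        self.alpha = alpha                   # mixing weight (assumed hyperparameter)

    def forward(self, input_ids, attention_mask, condition_mask):
        # condition_mask: 1 for tokens belonging to the condition, else 0.
        out = self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            output_attentions=True,
        )
        h = out.last_hidden_state                          # (B, T, H)

        # Original prediction from mean-pooled token features.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (h * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        score_orig = self.scorer(pooled).squeeze(-1)

        # Intrinsic reweighting: how much attention each token pays to the
        # condition tokens (last layer, averaged over heads) serves as a
        # condition-relevance weight -- no external information is used.
        attn = out.attentions[-1].mean(dim=1)              # (B, T, T)
        relevance = (attn * condition_mask.unsqueeze(1).float()).sum(-1)
        relevance = relevance * attention_mask.float()     # zero out padding
        relevance = relevance / relevance.sum(-1, keepdim=True).clamp(min=1e-9)

        # Self-reweighted prediction from relevance-weighted token features,
        # amplifying condition-relevant features and suppressing the rest.
        reweighted = (h * relevance.unsqueeze(-1)).sum(1)
        score_rw = self.scorer(reweighted).squeeze(-1)

        # Self-augmentation: the reweighted output refines the original one.
        return (1 - self.alpha) * score_orig + self.alpha * score_rw
```

Under this reading, "self" means that both the reweighting signal (the attention distribution over condition tokens) and the augmentation signal (the reweighted score) come from the model itself, consistent with the abstract's claim that no external auxiliary information is introduced.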
Paper Type: long
Research Area: Semantics: Sentence-level Semantics, Textual Inference and Other areas
Contribution Types: NLP engineering experiment
Languages Studied: English