Abstract: The meaning conveyed by a sentence often depends on the context in which it appears. Despite the progress of sentence embedding methods, it remains unclear how to best modify a sentence embedding conditioned on its context. To address this problem, we propose Condition-Aware Sentence Embeddings (CASE), an efficient and accurate method for creating an embedding of a sentence under a given condition. First, CASE creates an embedding for the condition using a Large Language Model (LLM), where the sentence influences the attention scores computed for the tokens in the condition during pooling. Next, a supervised nonlinear projection is learnt to reduce the dimensionality of the LLM-based text embeddings. We show that CASE significantly outperforms previously proposed Conditional Semantic Textual Similarity (C-STS) methods on a standard benchmark dataset. We find that subtracting the condition embedding consistently improves the C-STS performance of LLM-based text embeddings. Moreover, the supervised dimensionality reduction method we propose not only reduces the dimensionality of the LLM-based embeddings but also significantly improves their performance.
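To make the recipe in the abstract concrete, here is a minimal sketch of one plausible reading of CASE: the condition is embedded by a causal LLM with the sentence prepended (so sentence tokens shape the condition tokens' hidden states), pooling is restricted to the condition span, and the unconditioned condition embedding is subtracted before comparison. The model name, pooling choices, and helper functions below are illustrative assumptions, not the authors' exact implementation; the supervised nonlinear projection described in the abstract (e.g. a small MLP trained on C-STS labels) is omitted here for brevity.

```python
# Hypothetical sketch of the CASE idea described in the abstract.
# Assumes a Hugging Face causal LM; "gpt2" is a stand-in for the
# LLM-based embedder actually used in the paper.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder model choice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()


@torch.no_grad()
def conditional_embedding(sentence: str, condition: str) -> torch.Tensor:
    """Embed `condition` while letting `sentence` influence it.

    With causal attention, condition tokens placed after the sentence
    attend to the sentence tokens, so mean-pooling over the condition
    span yields a sentence-aware condition embedding.
    """
    sent_ids = tokenizer(sentence, return_tensors="pt").input_ids
    cond_ids = tokenizer(" " + condition, return_tensors="pt").input_ids
    input_ids = torch.cat([sent_ids, cond_ids], dim=1)
    hidden = model(input_ids).last_hidden_state  # (1, seq_len, dim)
    cond_states = hidden[:, sent_ids.shape[1]:, :]  # condition span only
    return cond_states.mean(dim=1).squeeze(0)


@torch.no_grad()
def condition_only_embedding(condition: str) -> torch.Tensor:
    """Embed the condition alone, with no sentence context."""
    ids = tokenizer(condition, return_tensors="pt").input_ids
    return model(ids).last_hidden_state.mean(dim=1).squeeze(0)


def case_similarity(s1: str, s2: str, condition: str) -> float:
    """C-STS score for two sentences under a shared condition.

    Subtracting the unconditioned condition embedding from each side
    mirrors the abstract's finding that this subtraction consistently
    helps LLM-based text embeddings on C-STS.
    """
    c = condition_only_embedding(condition)
    e1 = conditional_embedding(s1, condition) - c
    e2 = conditional_embedding(s2, condition) - c
    return F.cosine_similarity(e1, e2, dim=0).item()


print(case_similarity(
    "A man is playing a guitar on stage.",
    "A woman strums an acoustic guitar in a park.",
    "the instrument being played",
))
```

Under this reading, the subtraction removes the condition-generic component from each conditioned embedding, leaving the sentence-specific signal relevant to the condition; the learnt projection would then map these high-dimensional vectors into a smaller, task-tuned space.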
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: Sentence Embeddings, Conditional Semantic Textual Similarity, Representation Learning, Large Language Models, Dimensionality Reduction
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 1563