Semantic Adapter for Universal Text Embeddings: Diagnosing and Mitigating Negation Blindness to Enhance Universality
Abstract: Text embeddings are crucial for various natural language processing tasks and are gaining popularity in both industry and academia. Recent progress in training data and LLMs has greatly advanced the development of universal text embeddings, which aim for a single unified model that can handle diverse tasks, domains, and languages. However, due to biases inherent in popular evaluation benchmarks, certain capabilities of these models remain unassessed. One such overlooked aspect is the models’ negation awareness. To address this gap in the existing literature, this paper presents a comprehensive analysis of negation awareness in state-of-the-art universal text embedding models. Our investigation reveals a substantial deficiency: these models frequently interpret negated text pairs as semantically similar, undermining their ability to accurately understand negated statements. To mitigate this issue and enhance the universality of text embeddings, we introduce a lightweight, parameter-free negation adapter. The proposed solution is a data-efficient and computationally efficient embedding re-weighting method that requires no modifications to the parameters of existing text embedding models. It significantly improves text embedding models’ negation awareness on both simple and complex negation understanding tasks. Furthermore, it also significantly improves the negation awareness of Large Language Model (LLM)-based, task-specific, high-dimensional universal text embeddings. These results not only bridge a key “universality gap” but also pave the way for a modular semantic adapter paradigm toward more universal, robust, and environmentally conscious text embeddings.
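The abstract describes the adapter only at a high level: a parameter-free re-weighting applied on top of frozen embeddings. The sketch below illustrates what such a post-hoc re-weighting could look like; the weight-estimation heuristic (up-weighting dimensions that change most between original and negated sentences) is an assumption for illustration, not the paper's actual method, and `estimate_weights` and `negation_reweight` are hypothetical names.

```python
import numpy as np


def estimate_weights(orig: np.ndarray, negated: np.ndarray) -> np.ndarray:
    """Estimate a per-dimension weight vector from paired embeddings.

    `orig` and `negated` are (n_pairs, dim) arrays of embeddings for
    original sentences and their negated counterparts. Dimensions that
    shift most under negation get larger weights. This heuristic is an
    illustrative assumption, not the method from the paper.
    """
    diff = np.abs(orig - negated).mean(axis=0)      # per-dimension sensitivity
    return 1.0 + diff / (diff.max() + 1e-12)        # weights in [1, 2]


def negation_reweight(embedding: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Re-weight a frozen embedding and re-normalize to unit length.

    No model parameters are touched: the adapter is a fixed element-wise
    transform applied after the encoder.
    """
    reweighted = embedding * weights
    norm = np.linalg.norm(reweighted)
    return reweighted / norm if norm > 0 else reweighted
```

A small labeled set of (sentence, negated sentence) pairs would suffice to fit `weights` once; at inference time the adapter is a single element-wise multiply plus normalization, consistent with the abstract's data- and compute-efficiency claims.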
External IDs: doi:10.3233/faia251305