Learning Robust Negation Text Representations

ACL ARR 2024 December Submission 1185 Authors

15 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Despite the rapid adoption of autoregressive large language models, smaller text encoders still play an important role in text understanding tasks that require rich contextualized representations. Negation is an important semantic function that is still not properly captured by such methods, affecting many downstream applications that rely on text embeddings. We propose a strategy to improve the negation robustness of text encoders by distilling data from large language models using diverse patterns of negation and hedging. We adopt a standard contrastive learning strategy to finetune a strong BERT-based model, and observe large improvements in negation understanding capabilities while maintaining competitive performance on general benchmarks. In addition, we show that our method can be adapted to LLMs, leading to improved performance on negation benchmarks.
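To make the described setup concrete, below is a minimal sketch of contrastive fine-tuning on (anchor, paraphrase, negated hard-negative) triplets of the kind an LLM could distill. The model name, pooling, triplet format, and InfoNCE-style loss with an in-batch temperature are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch: contrastive fine-tuning of a BERT encoder on
# LLM-distilled (anchor, paraphrase, negated hard-negative) triplets.
# Model choice, pooling, and temperature are assumptions for illustration.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any strong BERT-style encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

def embed(sentences):
    """Mean-pool the last hidden states into sentence embeddings."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    out = encoder(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)       # (B, T, 1)
    return (out * mask).sum(1) / mask.sum(1)           # (B, H)

def contrastive_loss(anchors, positives, negatives, temperature=0.05):
    """InfoNCE-style loss: pull each anchor toward its paraphrase and push it
    away from its negated counterpart and the other in-batch positives."""
    a = F.normalize(embed(anchors), dim=-1)
    p = F.normalize(embed(positives), dim=-1)
    n = F.normalize(embed(negatives), dim=-1)
    candidates = torch.cat([p, n], dim=0)              # (2B, H)
    logits = a @ candidates.T / temperature            # (B, 2B)
    labels = torch.arange(a.size(0))                   # positive sits at index i
    return F.cross_entropy(logits, labels)

# Toy triplet of the kind an LLM might generate (illustrative only):
loss = contrastive_loss(
    anchors=["The patient shows signs of improvement."],
    positives=["The patient is getting better."],
    negatives=["The patient shows no signs of improvement."],
)
loss.backward()
```

In this setup the negated sentence acts as a hard negative, so the encoder is explicitly penalized for mapping a sentence and its negation to nearby embeddings, which is the failure mode the abstract targets.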
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: negation understanding, sentence embedding, synthetic data
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 1185