Self-supervised Post-processing Method to Enrich Pretrained Word Vectors

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Findings
Submission Type: Regular Short Paper
Submission Track: Semantics: Lexical
Submission Track 2: Machine Learning for NLP
Keywords: Retrofitting, Word Embedding, Word Semantics
TL;DR: We propose an extension to a retrofitting approach (specifically, extrofitting) that semantically enriches word vectors without requiring external lexical constraints.
Abstract: Retrofitting techniques, which inject external resources into word representations, have compensated for the weakness of distributed representations in capturing semantic and relational knowledge between words. However, previous methods require additional external resources and strongly depend on the quality of the lexicon. To address these issues, we propose a simple extension of extrofitting, self-supervised extrofitting: extrofitting by the word vectors' own distribution. Our method improves vanilla embeddings on all word similarity tasks without any external resources. Moreover, it is effective across various languages, which implies that it will be useful for lexicon-scarce languages. On downstream tasks, we show its benefits in dialogue state tracking and text classification, reporting better and more generalized results than other word vector specialization methods.
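For context on the retrofitting family this work extends, the sketch below shows the classic lexicon-based update (in the style of Faruqui et al., 2015), which pulls each vector toward its lexicon neighbors while staying close to the original. This is the baseline setting whose lexicon dependence the paper removes; the toy vectors, lexicon, and parameter values here are illustrative assumptions, not the paper's self-supervised extrofitting procedure.

```python
import numpy as np

def retrofit(vectors, lexicon, alpha=1.0, beta=1.0, iters=10):
    """Lexicon-based retrofitting sketch: each word in the lexicon is
    iteratively moved toward the average of its original vector and its
    neighbors' current vectors. Words absent from the lexicon stay fixed."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iters):
        for w, nbrs in lexicon.items():
            nbrs = [n for n in nbrs if n in new]
            if not nbrs:
                continue
            # Weighted combination: alpha keeps the original embedding,
            # beta pulls toward each lexicon neighbor's current vector.
            num = alpha * vectors[w] + beta * sum(new[n] for n in nbrs)
            new[w] = num / (alpha + beta * len(nbrs))
    return new

# Toy example: "happy" and "glad" start orthogonal but are linked
# in a (hypothetical) synonym lexicon; "sad" is left untouched.
vecs = {"happy": np.array([1.0, 0.0]),
        "glad":  np.array([0.0, 1.0]),
        "sad":   np.array([-1.0, 0.0])}
lex = {"happy": ["glad"], "glad": ["happy"]}
out = retrofit(vecs, lex)
```

After retrofitting, the linked pair "happy"/"glad" becomes far more similar while "sad" is unchanged; self-supervised extrofitting aims for a comparable enrichment effect derived from the vector distribution itself rather than from such a hand-built lexicon.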
Submission Number: 915