Subword Attention and Post-Processing for Rare and Unknown Contextualized Embeddings

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: We propose a new architecture for estimating rare/unknown embeddings in contextualized models, which uses subword attention and post-processing.
Abstract: Word representations are an important aspect of Natural Language Processing (NLP). Representations are trained on large corpora, either as independent static embeddings or as part of a deep contextualized model. While word embeddings are useful, they struggle with rare and unknown words. As such, a large body of work has addressed estimating embeddings for rare and unknown words. However, most of these methods focus on static embeddings, with few targeting contextualized representations. In this work, we propose SPRUCE, a rare/unknown embedding architecture that focuses on contextualized representations. This architecture combines subword attention and embedding post-processing with the contextualized model to produce high-quality embeddings. We then demonstrate that these techniques lead to improved performance on most intrinsic and downstream tasks.
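The abstract only names the two components; as a rough illustration of the general idea, the sketch below shows one common way subword attention and a simple post-processing step could be realized: a learned query vector scores each subword embedding, a softmax turns the scores into weights, and the pooled vector is mean-centered. All function names, the fixed query vector, and the choice of mean-centering are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def subword_attention(subword_embs, query):
    """Pool subword embeddings into a single word embedding via attention.

    subword_embs: (n_subwords, dim) array of subword vectors.
    query: (dim,) vector playing the role of a learned attention query
           (fixed here purely for illustration).
    """
    scores = subword_embs @ query      # one attention logit per subword
    weights = softmax(scores)          # normalized attention weights
    return weights @ subword_embs      # convex combination of subwords

def postprocess(emb, corpus_mean):
    """Illustrative post-processing: mean-centering against a corpus mean."""
    return emb - corpus_mean
```

A usage sketch: pooling three 4-dimensional subword vectors yields one 4-dimensional word embedding, which can then be centered before being fed to (or compared against) the contextualized model's embedding space.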
Paper Type: short
Research Area: Machine Learning for NLP
Contribution Types: NLP engineering experiment
Languages Studied: English