LEARNING SEMANTIC WORD REPRESENTATIONS VIA TENSOR FACTORIZATION
Nov 03, 2017 (modified: Nov 03, 2017) · ICLR 2018 Conference Blind Submission
Abstract: Many state-of-the-art word embedding techniques involve factorization of a co-occurrence-based matrix. We aim to extend this approach by studying word embedding techniques that involve factorization of co-occurrence-based tensors (N-way arrays). We present two new word embedding techniques based on tensor factorization and show that, when trained on the same data, they outperform common methods on several semantic NLP tasks. To train one of the embeddings, we formulate a new joint tensor factorization problem and present an approach for solving it. Furthermore, we modify the performance metrics for the Outlier Detection task (Camacho-Collados & Navigli, 2016) to measure the quality of the higher-order relationships that a word embedding captures. Our tensor-based methods significantly outperform existing methods on this task under our new metric. Finally, we demonstrate that vectors in our embeddings can be composed multiplicatively to create distinct vector representations for each meaning of a polysemous word, in a way that is not possible with other common embeddings. We show that this property stems from the higher-order information the vectors contain, and is thus unique to our tensor-based embeddings.
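The abstract page gives no code, but the core operation it describes, factoring a 3-way co-occurrence tensor into word vectors, can be illustrated. Below is a minimal NumPy sketch of a rank-R CP (CANDECOMP/PARAFAC) decomposition computed via alternating least squares, followed by an elementwise (Hadamard) product as one plausible reading of "composed multiplicatively." The synthetic tensor, the ALS solver, and the word indices are illustrative assumptions; this is not the paper's joint factorization objective or its exact composition procedure.

    import numpy as np

    def khatri_rao(B, C):
        # Column-wise Kronecker product, shape (J*K, R).
        return np.einsum('jr,kr->jkr', B, C).reshape(-1, B.shape[1])

    def cp_als(T, rank, n_iters=200, seed=0):
        # Rank-R CP decomposition of a 3-way tensor by alternating
        # least squares: update each factor with the other two fixed.
        rng = np.random.default_rng(seed)
        A, B, C = (rng.standard_normal((dim, rank)) for dim in T.shape)
        for _ in range(n_iters):
            A = (np.reshape(T, (T.shape[0], -1))
                 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C)))
            B = (np.reshape(np.moveaxis(T, 1, 0), (T.shape[1], -1))
                 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C)))
            C = (np.reshape(np.moveaxis(T, 2, 0), (T.shape[2], -1))
                 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B)))
        return A, B, C

    # Toy usage: factor a synthetic low-rank co-occurrence-style tensor.
    n, R = 20, 5
    W = np.abs(np.random.default_rng(1).standard_normal((n, R)))
    T = np.einsum('ir,jr,kr->ijk', W, W, W)
    A, B, C = cp_als(T, rank=R)
    approx = np.einsum('ir,jr,kr->ijk', A, B, C)
    print(np.linalg.norm(T - approx) / np.linalg.norm(T))  # should be near 0

    # Hypothetical multiplicative composition: under a CP model, the
    # Hadamard product of two word vectors scores third-order
    # co-occurrence with both words, so it can act as a sense-specific
    # vector (e.g. "bank" * "river" vs. "bank" * "loan").
    bank, river = A[0], A[1]        # hypothetical rows for "bank", "river"
    bank_river_sense = bank * river

This composition makes sense only because the CP model is trilinear in its factors; a matrix-factorization embedding carries no comparable third-order structure, which is consistent with the abstract's claim that the property is unique to tensor-based embeddings.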
Keywords: Word Embeddings, Tensor Factorization, Natural Language Processing