Word embedding re-examined: is the symmetrical factorization optimal?

Zhichao Han; Jia Li; Xu Li; Hong Cheng

25 Sept 2019 (modified: 05 May 2023); ICLR 2020 Conference Blind Submission; Readers: Everyone

Abstract: As observed in previous work, many word embedding methods exhibit two interesting properties: (1) words with similar semantic meanings are embedded close together; (2) analogy structure exists in the embedding space, such that ``\emph{Paris} is to \emph{France} as \emph{Berlin} is to \emph{Germany}''. We theoretically analyze the mechanism that leads to these properties. Specifically, the embedding can be viewed as a linear transformation from the word-context co-occurrence space to the embedding space. We reveal how the relative distances between nodes change during this transformation, and show that the linear transformation gives rise to both properties. Based on this analysis, we also answer the question of whether symmetrical factorization (e.g., \texttt{word2vec}) is better than the traditional SVD method. We further propose a method to improve the embedding. Experiments on real datasets verify our analysis.

Code: https://www.dropbox.com/sh/5d5j4pthcgzutdf/AABUvZPJpxUo8ugff1gQ7fIQa?dl=0

Keywords: word embedding, matrix factorization, linear transformation, neighborhood structure
