Interpreting Word Embeddings with Eigenvector Analysis

Anonymous

Published: 16 Nov 2018, Last Modified: 05 May 2023
NIPS 2018 Workshop IRASL Blind Submission
Readers: Everyone

Abstract: Dense word vectors have proven their value in many downstream NLP tasks over the past few years. However, the dimensions of such embeddings are not easily interpretable: given the d dimensions of a word vector, it is unclear what a high or low value in any one of them means. Previous approaches to this issue have mainly focused on either training sparse or non-negativity-constrained word embeddings, or post-processing standard pre-trained word embeddings. In contrast, we analyze conventional word embeddings trained with Singular Value Decomposition and show that they exhibit similar interpretability. We use a novel eigenvector analysis method inspired by Random Matrix Theory and show that semantically coherent groups form not only in the row space but also in the column space. This allows us to view individual word vector dimensions as human-interpretable semantic features.

TL;DR: Without requiring any constraints or post-processing, we show that the salient dimensions of word vectors can be interpreted as semantic features.

Keywords: word embeddings, eigenvector analysis, singular value decomposition, interpretability of word embeddings, random matrix theory
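
To make the idea of reading a single dimension as a semantic feature concrete, here is a minimal sketch. It is not the authors' method: it omits the Random Matrix Theory analysis entirely, and the vocabulary, the co-occurrence counts, and the helper names `svd_embeddings` and `top_words` are all hypothetical. It only shows how word vectors can be obtained from a truncated SVD and how one might list the words that load most heavily on each dimension.

```python
import numpy as np

# Hypothetical toy vocabulary and symmetric co-occurrence counts
# (illustrative only, not data from the paper).
vocab = ["cat", "dog", "pet", "stock", "bond", "market"]
counts = np.array([
    [0, 4, 5, 0, 0, 0],
    [4, 0, 6, 0, 0, 0],
    [5, 6, 0, 0, 0, 1],
    [0, 0, 0, 0, 3, 5],
    [0, 0, 0, 3, 0, 4],
    [0, 0, 1, 5, 4, 0],
], dtype=float)


def svd_embeddings(matrix, dim):
    """Word vectors from a truncated SVD: rows of U_k scaled by sqrt(S_k)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])


def top_words(embeddings, vocab, dimension, k=3):
    """Words with the largest absolute loading on one embedding dimension."""
    order = np.argsort(-np.abs(embeddings[:, dimension]))[:k]
    return [vocab[i] for i in order]


emb = svd_embeddings(counts, dim=2)
for d in range(emb.shape[1]):
    # Inspect which words dominate each retained dimension.
    print(d, top_words(emb, vocab, d))
```

On a matrix with two nearly separate blocks like this one, each retained dimension would be expected to be dominated by one group of related words, which is the kind of coherence the abstract refers to. Scaling by the square root of the singular values is one common convention among several and is an assumption here.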
