Interpreting Word Embeddings Using a Distribution Agnostic Approach Employing Hellinger Distance

Tamás Ficsor, Gábor Berend

Published: 01 Jan 2020, Last Modified: 27 Jun 2023, TDS 2020

Abstract: Word embeddings can encode semantic and syntactic features and have contributed to many recent successes in NLP tasks. Despite these successes, extracting lexical information directly from them is not trivial. In this paper, we propose transforming the embedding space into a more interpretable one using the Hellinger distance. We additionally suggest a distribution-agnostic approach based on Kernel Density Estimation, and we introduce a method for measuring the interpretability of word embeddings. Our results suggest that the Hellinger-based calculation improves interpretability by 1.35% on average over the Bhattacharyya distance and adapts better to unknown words.
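To make the quantities named in the abstract concrete, the following is a minimal sketch of how the Hellinger and Bhattacharyya distances can be computed between two densities estimated with a Gaussian kernel. It is not the authors' implementation: the function names, the bandwidth, the grid, and the two sample lists (imagined coefficient values of one embedding dimension for words inside and outside some semantic category) are all assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def discretize(density, grid):
    """Evaluate a density on a grid and normalize the values to sum to 1."""
    values = [density(x) for x in grid]
    total = sum(values)
    return [v / total for v in values]


def bhattacharyya_coefficient(p, q):
    """Overlap between two discrete distributions (1 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def hellinger_distance(p, q):
    """Hellinger distance, bounded in [0, 1]."""
    bc = min(1.0, bhattacharyya_coefficient(p, q))  # guard against rounding
    return math.sqrt(1.0 - bc)


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance, unbounded as the overlap goes to zero."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0.0 else -math.log(bc)


# Hypothetical data: values of one embedding dimension for two word groups.
in_category = [0.8, 1.1, 0.9, 1.3, 1.0]
other_words = [-0.2, 0.1, 0.0, -0.4, 0.3]

grid = [-3.0 + 0.1 * i for i in range(61)]  # evaluation points in [-3, 3]
p = discretize(gaussian_kde(in_category, bandwidth=0.3), grid)
q = discretize(gaussian_kde(other_words, bandwidth=0.3), grid)

h = hellinger_distance(p, q)
db = bhattacharyya_distance(p, q)
print(f"Hellinger: {h:.4f}, Bhattacharyya: {db:.4f}")
```

Both distances are derived from the same overlap term, the Bhattacharyya coefficient. The Hellinger distance stays within [0, 1], whereas the Bhattacharyya distance grows without bound as the overlap vanishes. Because the densities come from a kernel estimate, the sketch makes no assumption about the shape of the underlying distribution.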
