Comprehensive Evaluation of Word Embeddings for Highly Inflectional Language

Published: 2021, Last Modified: 15 Oct 2023, ICCCI (CCIS Volume) 2021, Readers: Everyone
Abstract: This paper presents experiments aimed at choosing the best word embeddings for highly inflectional languages. In particular, the authors evaluated the word embeddings for the Polish language that were available in the literature at the time of writing. Static embeddings such as Word2Vec, GloVe, and fastText, together with their training settings, were taken into account. The evaluation covered 121 different embedding models provided by IPI PAN, OPI, Kyubyong, and Facebook. The experiments were divided into two tasks: the first examined word analogies, and the second verified the similarity and relatedness of word pairs. The results showed that, in terms of accuracy, the Facebook fastText model trained on the Common Crawl collection should be considered the best model under the assumptions of the experimental session.
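The two evaluation tasks named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the tasks were scored, so the choices here are assumptions: analogies are solved with the common vector-offset method (nearest cosine neighbour of b - a + c), and similarity is scored as the Spearman correlation between model cosine similarities and human ratings. The toy vectors, the human ratings, and all function names are hypothetical and serve only as illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method (assumed)."""
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    # The three question words are excluded from the candidates.
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

def analogy_accuracy(emb, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    hits = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in questions)
    return hits / len(questions)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def similarity_correlation(emb, rated_pairs):
    """Spearman correlation: Pearson correlation computed on ranks."""
    model = [cosine(emb[w1], emb[w2]) for w1, w2, _ in rated_pairs]
    human = [score for _, _, score in rated_pairs]
    return pearson(ranks(model), ranks(human))

# Hypothetical 4-dimensional toy vectors (royalty, gender, human, animal).
emb = {
    "król":      [1.0,  0.5, 1.0, 0.0],
    "królowa":   [1.0, -0.5, 1.0, 0.0],
    "mężczyzna": [0.0,  0.5, 1.0, 0.0],
    "kobieta":   [0.0, -0.5, 1.0, 0.0],
    "kot":       [0.0,  0.0, 0.0, 1.0],
}

# Hypothetical analogy questions: (a, b, c, expected answer).
questions = [
    ("mężczyzna", "król", "kobieta", "królowa"),
    ("król", "królowa", "mężczyzna", "kobieta"),
]

# Hypothetical human similarity ratings on an arbitrary scale.
rated_pairs = [
    ("król", "królowa", 8.0),
    ("król", "mężczyzna", 6.0),
    ("mężczyzna", "kobieta", 7.0),
    ("król", "kot", 1.0),
]

acc = analogy_accuracy(emb, questions)
rho = similarity_correlation(emb, rated_pairs)
print(f"analogy accuracy: {acc:.2f}, Spearman rho: {rho:.2f}")
```

In a real comparison the same two functions would be applied to each pretrained model in turn, with the toy dictionary replaced by vectors loaded from that model and the toy questions replaced by a Polish analogy and similarity benchmark.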