Computing Classifier-Based Embeddings with the Help of Text2ddc

Published: 01 Jan 2019 · Last Modified: 22 Jun 2025 · CICLing (2) 2019 · License: CC BY-SA 4.0
Abstract: We introduce a method for computing classifier-based semantic spaces on top of text2ddc. To this end, we optimize text2ddc, a neural network-based classifier for the Dewey Decimal Classification (DDC). Using a wide range of linguistic features, including sense embeddings, we achieve an F-score of 87.4%. To show that our approach is language-independent, we evaluate text2ddc by classifying texts in six different languages. On this basis, we develop a topic model that generates probability distributions over topics for linguistic input at the word (sense), sentence, and text level. In contrast to related approaches, these probabilities are estimated with text2ddc, so that each dimension of the resulting embeddings corresponds to a separate DDC class. Finally, we evaluate this Classifier-based Semantic space (CaSe) in the context of text classification and show that it improves classification results.
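The core idea behind CaSe, as described in the abstract, is to use a classifier's probability distribution over DDC classes as an embedding whose dimensions are directly interpretable as topics. The sketch below illustrates this idea only; it does not reproduce the authors' neural architecture. A simple TF-IDF plus logistic regression model stands in for text2ddc, and the training texts and DDC labels are hypothetical.

```python
# Minimal sketch of the CaSe idea: treat a text classifier's probability
# distribution over DDC classes as an embedding, one dimension per class.
# A TF-IDF + logistic regression pipeline stands in for text2ddc here;
# the corpus and class labels below are purely illustrative.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data labelled with (hypothetical) top-level DDC classes.
train_texts = [
    "The parliament passed a new election law.",
    "The team proved a theorem on prime distributions.",
    "The novel explores memory and loss in post-war Europe.",
]
train_labels = ["300 Social sciences", "500 Science", "800 Literature"]

# Stand-in classifier for text2ddc.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train_texts, train_labels)

def case_embedding(text: str):
    """Return the classifier's probability distribution over DDC classes.

    Each dimension of the vector corresponds to one DDC class, so the
    embedding doubles as an interpretable topic distribution.
    """
    return clf.predict_proba([text])[0]

# Embed a new sentence and inspect the most probable classes.
vec = case_embedding("A study of voting behaviour in local elections.")
for label, prob in sorted(zip(clf.classes_, vec), key=lambda p: -p[1]):
    print(f"{label}: {prob:.2f}")
```

In the paper's setting, the same principle applies at the word (sense), sentence, and text level, with text2ddc supplying the class probabilities; the resulting vectors can then serve as features for downstream text classification.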