One Step Beyond: Keyword Extraction in German Utilising Surprisal from Topic Contexts

Published: 01 Jan 2022, Last Modified: 19 Feb 2025, SAI (2) 2022, CC BY-SA 4.0
Abstract: This paper describes a study on keyword extraction in German with a model that utilises Shannon information as a lexical feature. Lexical information content was derived from large, extra-sentential semantic contexts of words in the framework of the novel Topic Context Model. We observed that lexical information content increased the performance of a Recurrent Neural Network in keyword extraction, outperforming TextRank and two other models, i.e., Named Entity Recognition and Latent Dirichlet Allocation, which were used for comparison in this study.
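To make the notion of Shannon information as a lexical feature concrete, the following is a minimal sketch of how a word's surprisal, -log2 P(word | topic), could be computed from a topic-labelled corpus. It is not the Topic Context Model described in the paper: the function name `surprisal_by_topic`, the toy corpus, and the use of plain relative frequencies as the probability estimate are all assumptions made for illustration.

```python
import math
from collections import Counter


def surprisal_by_topic(docs_by_topic):
    """Return {topic: {word: surprisal in bits}}.

    Assumption: P(word | topic) is estimated as the word's relative
    frequency among all tokens assigned to that topic. The paper's
    actual estimation procedure is not given in the abstract.
    """
    result = {}
    for topic, docs in docs_by_topic.items():
        # Count every token across all documents of this topic.
        counts = Counter(word for doc in docs for word in doc)
        total = sum(counts.values())
        # Surprisal (Shannon information) of each word given the topic.
        result[topic] = {w: -math.log2(c / total) for w, c in counts.items()}
    return result


# Hypothetical toy corpus: one topic, two tokenised German documents.
toy_corpus = {
    "sport": [["tor", "spiel", "tor"], ["spiel", "trainer"]],
}

scores = surprisal_by_topic(toy_corpus)
for word, bits in sorted(scores["sport"].items()):
    print(f"{word}: {bits:.3f} bits")
```

In this sketch, a word that is rare within its topic (here "trainer") receives a higher surprisal than a frequent one (here "tor"), and such a value could then be passed to a sequence model as one feature per token.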