Evaluating Existing Lemmatisers on Unedited Byzantine Greek Poetry

Published: 01 Jan 2023, Last Modified: 18 May 2025. Venue: ALP@RANLP 2023. License: CC BY-SA 4.0
Abstract: This paper reports the results of a comparative evaluation carried out with a view to developing a new lemmatizer for unedited Byzantine Greek texts. In the experiment, four existing lemmatizers, all pre-trained on Ancient Greek texts, were evaluated on how well they handle texts that stem from the Middle Ages and display a number of peculiarities. The aim of the study is to gain insight into the pitfalls of existing lemmatization approaches and into the specific challenges of our Byzantine Greek corpus, in order to develop a lemmatizer that can cope with those peculiarities. The results show an accuracy drop of 20 percentage points on our corpus, which is further investigated in a qualitative error analysis.