Suffix inflectional morphology generation for Amharic text

Published: 03 Mar 2024, Last Modified: 11 Apr 2024, Venue: AfricaNLP 2024, License: CC BY 4.0
Keywords: Morphology, suffixation, morphological generation, lemma
TL;DR: This paper addresses morphological generation for Amharic lemmas through suffixation and letter series transformation.
Abstract: Amharic is a morphologically rich language in which a single lemma can form a variety of words through inflection or derivation. Generating such word variants manually for second-language learners and NLP applications is a challenging task that calls for an automatic morphology generator. In this study, we have developed a new Amharic morphology generator tool that inflects noun and verb lemmas for possessive, gender, and number. For possessive inflection, nouns can be inflected in both singular and plural forms, while verbs can be inflected only in singular forms. For number inflection, both nouns and verbs can be inflected. To construct these rules, we followed the Amharic word affixation rules described by linguists. Before applying the suffixation and letter series transformation rules, we analyze the word's root form in the sentence, which helps us accurately apply the inflected word formation rules based on the lemma's part of speech (POS). Finally, we evaluated the tool by comparing its inflected forms with those produced by linguists; the tool achieves 76.9% accuracy against the linguist-generated results. The results show that suffix-inflected forms of common nouns, mass nouns, and verbs are generated correctly, whereas the tool treats some proper nouns as common nouns when generating their inflected forms, a limitation that needs to be addressed in further studies.
Submission Number: 5
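Illustrative sketch (not the authors' tool): the abstract describes inflection as suffixation combined with a letter series transformation. The Python snippet below shows one way such a rule could look for two simple cases, noun pluralization and first-person singular possessive. It assumes that the Ethiopic Unicode block lays out each consonant as a row of eight code points, so the series (order) of a letter can be read from its code point modulo 8. The function names, the restriction to these two rules, and the treatment of any sixth-series final letter as consonant-final are simplifying assumptions made here for illustration; they are not taken from the paper.

```python
# Hypothetical sketch of "suffixation + letter series transformation".
# Assumption: Ethiopic syllables in U+1200..U+1357 are laid out in rows of 8
# code points per consonant, so (code point - 0x1200) % 8 gives the series.
# A few rows have unassigned slots; this sketch does not handle them.

ETHIOPIC_START = 0x1200
ETHIOPIC_END = 0x1357

# Zero-based series indices (the 5th, 6th and 7th orders of a letter).
FIFTH, SIXTH, SEVENTH = 4, 5, 6


def series_of(ch: str) -> int:
    """Return the zero-based series (order) of an Ethiopic syllable."""
    cp = ord(ch)
    if not ETHIOPIC_START <= cp <= ETHIOPIC_END:
        raise ValueError(f"not an Ethiopic syllable: {ch!r}")
    return (cp - ETHIOPIC_START) % 8


def to_series(ch: str, series: int) -> str:
    """Return the same consonant in a different series."""
    return chr(ord(ch) - series_of(ch) + series)


def pluralize(noun: str) -> str:
    """Plural: sixth-series final letter -> seventh series + 'ች'; else add 'ዎች'."""
    last = noun[-1]
    if series_of(last) == SIXTH:
        return noun[:-1] + to_series(last, SEVENTH) + "ች"
    return noun + "ዎች"


def possessive_1sg(noun: str) -> str:
    """'my X': sixth-series final letter -> fifth series; else add 'ዬ'."""
    last = noun[-1]
    if series_of(last) == SIXTH:
        return noun[:-1] + to_series(last, FIFTH)
    return noun + "ዬ"


if __name__ == "__main__":
    print(pluralize("ቤት"))       # ቤቶች  (houses)
    print(pluralize("ገበሬ"))      # ገበሬዎች (farmers)
    print(possessive_1sg("ቤት"))  # ቤቴ   (my house)
```

A full generator along the lines described in the abstract would also need POS-dependent rule selection, gender forms, and handling of proper nouns, none of which this sketch attempts.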