A Morphological Processor for Russian with Extended Functionality

Published: 01 Jan 2017, Last Modified: 10 Aug 2024AIST 2017EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: The paper presents an open-source morphological processor of Russian texts recently developed and named CrossMorphy. The processor performs lemmatization, morphological tagging of both dictionary and non-dictionary words, contextual and non-contextual morphological disambiguation, generation of word forms, as well as morphemic parsing of words. Besides the extended functionality, emphasis is put on linguistic quality of word processing and easy integration into programming projects. CrossMorphy is fully implemented in C++ programming language on the base of OpenCorpora vocabulary data. To clarify the reasons of its development, a comparison of several freely available morphological processors for Russian is given, across their linguistic and some technological properties. The experimental evaluation shows that CrossMorphy ensures rather high quality of word processing.
Loading