Abstract: Mongolian is a typical agglutinative language: it expresses different grammatical meanings by attaching different affixes to a stem. In agglutinative languages, each affix usually carries a single meaning, and affixes attach according to regular rules. In Mongolian, the affixes attached to a stem can therefore be used to infer the stem's part of speech and semantic information, as well as the collocation information associated with the stem. Based on these characteristics of Mongolian word formation, and taking the morpheme as the unit of segmentation, this paper proposes a multi-feature fusion method for Mongolian part-of-speech tagging using a conditional random field (CRF) model. Experiments show that the method achieves satisfactory results, with a part-of-speech tagging accuracy of 98.8%.
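To make the idea of morpheme-level feature fusion concrete, the following is a minimal sketch of how per-morpheme features might be assembled before being passed to a CRF toolkit. It is not the paper's actual feature set: the feature names, the `stem`/`affix` labels, and the placeholder morphemes (`stemA`, `+sfx1`, `stemB`) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (surface form, kind), where kind is "stem" or "affix".
Morpheme = Tuple[str, str]


def morpheme_features(seq: List[Morpheme], i: int) -> Dict[str, object]:
    """Build a feature dict for the morpheme at position i (illustrative features only)."""
    form, kind = seq[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "form": form,
        "kind": kind,
        "is_first": i == 0,
        "is_last": i == len(seq) - 1,
    }
    # Context feature: the preceding morpheme, if any.
    if i > 0:
        prev_form, prev_kind = seq[i - 1]
        feats["prev_form"] = prev_form
        feats["prev_kind"] = prev_kind
    # Context feature: the following morpheme, if any.
    if i + 1 < len(seq):
        next_form, next_kind = seq[i + 1]
        feats["next_form"] = next_form
        feats["next_kind"] = next_kind
        # Fused feature: an affix directly after a stem is treated as a cue
        # to that stem's part of speech (an assumption of this sketch).
        if kind == "stem" and next_kind == "affix":
            feats["following_affix"] = next_form
    return feats


def sequence_features(seq: List[Morpheme]) -> List[Dict[str, object]]:
    """Feature dicts for every morpheme in one segmented sentence."""
    return [morpheme_features(seq, i) for i in range(len(seq))]


# Placeholder morphemes, not real Mongolian data.
sample = [("stemA", "stem"), ("+sfx1", "affix"), ("stemB", "stem")]
features = sequence_features(sample)
```

Each sentence becomes a list of feature dictionaries, one per morpheme, which is the kind of input that CRF toolkits such as sklearn-crfsuite typically accept alongside a parallel list of part-of-speech labels.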