Abstract: Most previous studies on Korean Named Entity Recognition (NER) have focused on morphological-level information because the language is rich in character diversity. This paper presents an improved unigram-level Korean NER model with a sub-character level representation, jamo, which can capture the unique linguistic structure of Korean along with its syntactic properties and morphological variations. The experimental results show that exploiting sub-character information yields an average gain of about 2 F1, and that our proposed C-GRAM model outperforms the baseline by about 3 F1.
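To make the sub-character representation concrete, the following minimal sketch decomposes precomposed Hangul syllables into their constituent jamo using the standard Unicode arithmetic. It is an illustration only, not the tokenization used in the paper; the function name `to_jamo` and the choice to emit conjoining jamo code points are assumptions made for this example.

```python
import unicodedata

# Constants from the Unicode Hangul syllable decomposition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28
S_COUNT = 19 * V_COUNT * T_COUNT  # 11172 precomposed syllables


def to_jamo(text):
    """Split each precomposed Hangul syllable into conjoining jamo.

    Hypothetical helper for illustration; other characters pass through.
    """
    out = []
    for ch in text:
        idx = ord(ch) - S_BASE
        if 0 <= idx < S_COUNT:
            lead, rest = divmod(idx, V_COUNT * T_COUNT)
            vowel, tail = divmod(rest, T_COUNT)
            out.append(chr(L_BASE + lead))       # initial consonant
            out.append(chr(V_BASE + vowel))      # medial vowel
            if tail:                             # final consonant is optional
                out.append(chr(T_BASE + tail))
        else:
            out.append(ch)
    return out


if __name__ == "__main__":
    word = "한국어"
    jamo = to_jamo(word)
    print(len(word), "syllables ->", len(jamo), "jamo")
    # Cross-check against the standard library's canonical decomposition.
    print("".join(jamo) == unicodedata.normalize("NFD", word))
```

The same decomposition is available through `unicodedata.normalize("NFD", ...)`; the explicit arithmetic is shown only to expose the initial, medial, and final positions that a sub-character model could treat as separate input units.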