Punctuation as Implicit Annotations for Chinese Word Segmentation

Published: 2009, Last Modified: 15 May 2023, Comput. Linguistics 2009, Readers: Everyone
Abstract: We present a Chinese word segmentation model learned from punctuation marks, which are perfect word delimiters. The learning is aided by a manually segmented corpus. Our method is considerably more effective than previous methods in unknown word recognition. This is a step toward addressing one of the toughest problems in Chinese word segmentation.
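
The abstract does not spell out the model or its features, so the following is only a minimal sketch of the underlying intuition, not the authors' method. It assumes that the character right after a punctuation mark (or at the start of the text) begins a word, and the character right before one (or at the end of the text) ends a word, which yields boundary examples from unsegmented text at no annotation cost. The punctuation set and the function name `boundary_examples` are hypothetical choices made for illustration.

```python
import re

# Hypothetical set of full-width punctuation marks treated as delimiters;
# this list is an assumption for illustration, not taken from the paper.
PUNCT_CLASS = "[，。！？；：、]"


def boundary_examples(text):
    """Collect characters known to start or end a word.

    Each punctuation-delimited segment must begin with a word-initial
    character and finish with a word-final character, because a
    punctuation mark can never fall inside a word.
    """
    # Split on punctuation and drop empty pieces (e.g. after a final mark).
    segments = [s for s in re.split(PUNCT_CLASS, text) if s]
    word_initial = [s[0] for s in segments]
    word_final = [s[-1] for s in segments]
    return word_initial, word_final


initial, final = boundary_examples("我喜欢北京，因为天气很好。")
print(initial)  # ['我', '因']
print(final)    # ['京', '好']
```

Examples harvested this way could serve as training signal for a boundary classifier alongside a manually segmented corpus, which is the combination the abstract describes at a high level.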