Abstract: Previous research showed that a Kalman filter based human-computer interaction algorithm for Chinese word segmentation achieves encouraging results in reducing user interventions. This paper designs an improved statistical model for ancient Chinese texts and integrates it into the Kalman filter based framework. An online interactive system for segmenting ancient Chinese corpora is presented. Experiments show that this approach has an advantage in processing domain-specific text without the support of dictionaries or annotated corpora. The improved statistical model outperformed the baseline model by 30% in segmentation precision.