Abstract: Chinese word segmentation (CWS) is an important research topic in information retrieval (IR) and natural language processing (NLP). Deep neural networks with context features have made significant progress on this task. However, these deep models may fail to handle rare or ambiguous words, which limits overall CWS performance. In this paper, we propose a lexicon-enhanced adaptive attention network (LAAN), which takes full advantage of external lexicons to handle rare or ambiguous words. Specifically, we devise an adaptive attention mechanism to learn a lexicon-aware representation. In addition, we propose a fusion gate that integrates the additional word information with context information to improve CWS performance. LAAN is evaluated on four benchmark datasets, and the experimental results demonstrate that LAAN consistently outperforms the compared methods.