Chinese Spelling Check with Nearest Neighbors

Anonymous

16 Dec 2022 (modified: 05 May 2023)
ACL ARR 2022 December Blind Submission
Readers: Everyone
Abstract: Chinese Spelling Check (CSC) aims to detect and correct erroneous tokens in Chinese text and has a wide range of applications. In this paper, we introduce InfoKNN-CSC, which extends the standard CSC model by linearly interpolating it with a $k$-nearest neighbors ($k$NN) model. Moreover, phonetic, graphic, and contextual information about tokens and their contexts is incorporated into the design of the $k$NN query and key, according to the characteristics of the task. After retrieval, to match candidates more accurately, we also rerank them based on the n-gram overlap between the retrieved values and the input. Experiments on the SIGHAN benchmarks demonstrate that the proposed model achieves state-of-the-art performance with substantial improvements over existing work.
Paper Type: long
Research Area: NLP Applications
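
The linear interpolation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common kNN-LM-style formulation, in which retrieved neighbors are weighted by a softmax over negative distances; the abstract does not state the exact formula used by InfoKNN-CSC. The function names, the `temperature` and `lam` parameters, and the toy data are hypothetical, and the phonetic/graphic key design and the n-gram reranking step are not modeled here.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (distance, token) pairs into a distribution over tokens.

    Assumption: weights are a softmax over negative distances, as in
    kNN-LM-style models. The paper's actual weighting may differ.
    """
    if not neighbors:
        return {}
    scores = [-dist / temperature for dist, _ in neighbors]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, neighbors):
        probs[token] += weight / total  # neighbors sharing a token add up
    return dict(probs)


def interpolate(p_model, p_knn, lam=0.5):
    """Compute lam * p_knn + (1 - lam) * p_model over the union of tokens."""
    if not p_knn:
        # Nothing retrieved: fall back to the base model's distribution.
        return dict(p_model)
    tokens = set(p_model) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_model.get(t, 0.0)
        for t in tokens
    }


# Toy example: the base model slightly prefers one character, while the
# retrieved neighbors mostly vote for the other.
p_model = {"在": 0.6, "再": 0.4}
neighbors = [(0.0, "再"), (0.0, "再"), (0.0, "在")]
p_final = interpolate(p_model, knn_distribution(neighbors), lam=0.5)
prediction = max(p_final, key=p_final.get)
```

In this toy case the retrieved neighbors shift the final prediction away from the base model's first choice, which is the effect that interpolation is meant to provide when the datastore contains relevant corrections.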