Abstract: Chinese Spelling Check (CSC) is the task of detecting and correcting wrong characters in Chinese sentences. Because most Chinese spelling mistakes stem from visual or phonetic similarity between characters, recent work tends to use external phonological and morphological resources for this task. However, these approaches rely heavily on hand-constructed confusion sets and multimodal data, which incurs high labor costs. To address this, we propose an end-to-end generative model called PromptCSC. We first observe that misspelled characters introduce unnatural semantic incoherence into a sentence. Using a prompt template as a knowledge probe, PromptCSC detects this incoherence and outputs an error probability for each character in the sentence. The flagged positions are then corrected using BERT’s soft mask mechanism. Experimental results on the SIGHAN benchmarks show that our approach achieves excellent performance without external resources.
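The soft-masking step mentioned above can be illustrated with a minimal sketch. It assumes the mechanism is the usual convex interpolation between a character's embedding and the [MASK] embedding, weighted by the detector's error probability. The function name `soft_mask` and the toy vectors are hypothetical and are not taken from the PromptCSC implementation.

```python
from typing import List

Vector = List[float]


def soft_mask(
    embeddings: List[Vector],
    mask_embedding: Vector,
    error_probs: List[float],
) -> List[Vector]:
    """Blend each character embedding with the [MASK] embedding.

    Assumed formula (illustrative): e'_i = p_i * e_mask + (1 - p_i) * e_i,
    where p_i is the detected error probability of character i.
    """
    if len(embeddings) != len(error_probs):
        raise ValueError("need one error probability per character")

    blended = []
    for emb, p in zip(embeddings, error_probs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("error probabilities must lie in [0, 1]")
        if len(emb) != len(mask_embedding):
            raise ValueError("embedding dimensions must match")
        # p close to 1 pushes the input toward [MASK]; p close to 0 keeps it.
        blended.append(
            [p * m + (1.0 - p) * e for e, m in zip(emb, mask_embedding)]
        )
    return blended
```

Under this assumption, a character judged certainly correct (probability 0) passes through unchanged, while one judged certainly wrong (probability 1) is replaced by the [MASK] embedding, so the corrector has to predict it from context alone.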