Abstract: Grammatical Error Correction (GEC) is usually treated as a translation task in which an erroneous sentence serves as the source language and the corrected sentence as the target language. State-of-the-art GEC models often adopt the Transformer-based sequence-to-sequence architecture of machine translation. However, most of these approaches ignore syntactic information, because the syntactic structure of an erroneous sentence is itself full of errors and therefore not beneficial to GEC. In this paper, we propose a novel Error-Correction Constituent Parsing (ECCP) task, which uses the constituent parses of corrected sentences to avoid the harmful effects of erroneous syntax. We also propose an architecture consisting of one encoder and two decoders. Transformer-based GEC models contain millions of parameters, while the labeled training data is substantially smaller than the synthetic pre-training data. We therefore add adapter layers to the proposed architecture and use adapter tuning during fine-tuning to alleviate this low-resource issue. We conduct experiments on the CoNLL-2014, BEA-2019, and JFLEG test sets in both unsupervised and supervised settings. Experimental results show that our method outperforms state-of-the-art baselines and achieves superior performance on all datasets.