SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser

Anonymous

17 Apr 2022 (modified: 05 May 2023), ACL ARR 2022 April Blind Submission, Readers: Everyone
Abstract: Despite their tremendous success, current cutting-edge grammatical error correction (GEC) models make little use of syntactic knowledge, which plays an important role when humans try to understand and fix ungrammatical sentences. This work proposes a syntax-enhanced GEC approach (SynGEC) that incorporates syntactic information into the encoder of GEC models. The key challenge is that the performance of off-the-shelf parsers drops dramatically on ungrammatical input, since they are usually trained on clean, grammatical sentences. To address this challenge, we propose to build a tailored GEC-oriented parser (GOPar) using parallel GEC training data as a pivot. First, we present an extended annotation scheme that represents both grammatical errors and syntactic structure in a unified tree. Then, we obtain parse trees for the incorrect source sentences by projecting the trees of the correct target sentences, and use them to train GOPar. We employ graph convolutional networks to encode the tree structures produced by GOPar. Experiments on three English/Chinese benchmark datasets show that SynGEC consistently and substantially outperforms strong baselines and achieves new single-model state-of-the-art performance on all datasets.
Paper Type: long
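
The abstract's last modeling step, encoding a parse tree with a graph convolutional network, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a mean-aggregation layer of the form ReLU(D^-1 (A + I) H W), ignores the dependency and error labels that the extended annotation scheme would carry, and uses pure Python lists instead of a deep learning framework. The function names, the example sentence "She go home", and its head indices are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def tree_adjacency(heads: List[int]) -> Matrix:
    """Build an undirected adjacency matrix with self-loops.

    heads[i] is the index of token i's head, or -1 for the root
    (an assumed encoding, chosen for this sketch).
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return adj


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """Apply one layer: ReLU(D^-1 A H W), averaging over tree neighbours."""
    n, d_in, d_out = len(adj), len(weight), len(weight[0])
    out = []
    for i in range(n):
        degree = sum(adj[i])  # at least 1 because of the self-loop
        # Mean of the feature vectors of token i and its tree neighbours.
        agg = [
            sum(adj[i][j] * feats[j][k] for j in range(n)) / degree
            for k in range(d_in)
        ]
        # Linear projection followed by ReLU.
        out.append([
            max(0.0, sum(agg[k] * weight[k][o] for k in range(d_in)))
            for o in range(d_out)
        ])
    return out


# Hypothetical tree for the ungrammatical input "She go home":
# "go" (index 1) is the root; "She" and "home" both attach to it.
heads = [1, -1, 1]
adj = tree_adjacency(heads)

# One-hot token features and an identity weight make the mixing visible.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adj, identity, identity)
```

With these toy inputs, each row of `hidden` is the average of a token's own one-hot vector and those of its tree neighbours, so "She" mixes only with "go", while "go" mixes with all three tokens. In a real encoder the weight matrix would be learned and the result combined with the sequence encoder's states.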