Context Beyond Grammar: Synonym Substitution for Korean Grammatical Error Correction in Specialized Texts

ACL ARR 2024 December Submission 1045 Authors

15 Dec 2024 (modified: 05 Feb 2025) | ACL ARR 2024 December Submission | CC BY 4.0
Abstract: Previous studies on grammatical error correction (GEC) have focused primarily on language learner corpora, which consist of texts written by learners of a non-native language. In this study, we address a GEC task that involves selecting contextually appropriate words in texts containing domain-specific vocabulary. We propose the UniGEC (Unified-Replacement GEC) dataset, which combines the results of multiple models to determine, from token occurrence probabilities, how likely a synonym is to be substituted for a specific keyword. Our experiments show that UniGEC presents a more challenging task than language learner corpora. We observe that the performance gap widens as the number of synonyms increases. We also find significant performance variation across domains, which highlights the need for further work on synonym substitution in specialized texts so that GEC can be applied to a wider range of scenarios.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: Grammatical Error Correction, Synonym, Context
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: Korean
Submission Number: 1045