LLMs in alliance with Edit-based Models: Advancing In-Context Learning for Grammatical Error Correction by Specific Example Selection

Published: 30 Jun 2025, Last Modified: 17 Mar 2026
Venue: 20th Workshop on Innovative Use of NLP for Building Educational Applications
License: CC BY 4.0
Abstract: We show that few-shot Grammatical Error Correction (GEC) can be improved by using an encoder-based sequence labeling model, such as GECToR, to select similar examples. We demonstrate this on three Russian GEC corpora and the English BEA corpus. The effect is most pronounced on the new LoRuGEC corpus, where it yields gains of up to 5-10% F0.5 depending on the model. The corpus, released with our paper, contains 348 train and 612 test examples. It is designed for diagnostic purposes and is additionally equipped with writing-rule annotations. These annotations make it possible to further improve few-shot error correction by contrastively tuning a GECToR-like encoder on a rule classification task. This holds for a broad class of large language models. The best results are obtained with the 5-shot YandexGPT-5 Pro model, which achieves an F0.5 score of 83%.
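As an illustration only (not the paper's exact pipeline), similarity-based few-shot example selection can be sketched as follows. The random 768-dimensional vectors below are hypothetical stand-ins for encoder representations such as those a GECToR-like model would produce; the real method selects examples with the trained sequence labeling encoder.

```python
import numpy as np

def select_examples(query_vec, pool_vecs, k=5):
    """Return indices of the k pool examples most similar to the query
    by cosine similarity. Illustrative sketch: in the paper's setup the
    vectors would come from an encoder, not from random sampling."""
    q = query_vec / np.linalg.norm(query_vec)
    p = pool_vecs / np.linalg.norm(pool_vecs, axis=1, keepdims=True)
    sims = p @ q                      # cosine similarities to the query
    return np.argsort(-sims)[:k]      # indices of the k nearest examples

# Toy demonstration with random "encoder" vectors.
rng = np.random.default_rng(0)
pool = rng.normal(size=(100, 768))              # 100 candidate train examples
query = pool[42] + 0.01 * rng.normal(size=768)  # near-duplicate of example 42
idx = select_examples(query, pool, k=5)
print(idx[0])  # example 42 ranks first
```

The selected examples would then be placed into the few-shot prompt of the large language model before the sentence to be corrected.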