Spelling Correction for Estonian Learner Language

Published: 20 Mar 2023, Last Modified: 18 Apr 2023, NoDaLiDa 2023, Readers: Everyone
Keywords: spelling correction, context-sensitive spell-checking, learner language, Estonian
TL;DR: The aim was to evaluate n-gram-based statistical spell-checking algorithms on Estonian learner language and compare their performance to existing spelling correction tools (open-source and commercial).
Abstract: Second and foreign language (L2) learners often make spelling errors that differ from those of native speakers. Language-independent spell-checking algorithms that rely on n-gram models are context-sensitive and can therefore offer a simple way to improve the detection and correction of learner errors. As the open-source speller previously available for Estonian is rule-based, our aim was to evaluate the performance of bi- and trigram-based statistical spelling correctors on an error-tagged set of A2–C1-level texts written by L2 learners of Estonian. The newly trained spell-checking models were compared to existing correction tools (open-source and commercial). The best-performing Jamspell corrector was then trained on various datasets to analyse their effect on the correction results.
Student Paper: Yes, the first author is a student