Corpus and unsupervised benchmark: Towards Tagalog grammatical error correction

Published: 01 Jan 2025, Last Modified: 17 Dec 2024
Venue: Comput. Speech Lang. 2025
License: CC BY-SA 4.0
Abstract: Highlights
• We construct the first Tagalog GEC evaluation corpus.
• Our unsupervised GEC framework is independent of any data annotations.
• Our proposed pseudo-perplexity scoring method evaluates a sentence’s likely validity.
• Experimental results on two corpora verify the effectiveness of the proposed model.
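
The highlights do not spell out how the pseudo-perplexity score is computed. As a rough illustration only, the sketch below uses the commonly cited masked-language-model definition: mask each token in turn, take the model's probability of the original token given the rest of the sentence, and exponentiate the negative mean log-probability. The paper's exact formulation may differ. The names `pseudo_perplexity`, `toy_cond_prob`, and `TOY_VOCAB` are hypothetical, and the toy scorer stands in for a real masked language model.

```python
import math
from typing import Callable, Sequence

# Hypothetical toy vocabulary standing in for a trained masked language model.
TOY_VOCAB = {"kumain", "ako", "ng", "mansanas"}


def toy_cond_prob(tokens: Sequence[str], i: int) -> float:
    """Assumed stand-in for P(token_i | all other tokens).

    A real system would mask position i and query a masked LM; here known
    words simply get a high probability and unknown words a low one.
    """
    return 0.5 if tokens[i] in TOY_VOCAB else 0.05


def pseudo_perplexity(
    tokens: Sequence[str],
    cond_prob: Callable[[Sequence[str], int], float],
) -> float:
    """exp of the negative mean log-probability of each token given the rest."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    total_log_prob = 0.0
    for i in range(len(tokens)):
        total_log_prob += math.log(cond_prob(tokens, i))
    return math.exp(-total_log_prob / len(tokens))


# Lower scores suggest a more plausible sentence under the scorer.
valid = ["kumain", "ako", "ng", "mansanas"]
misspelled = ["kumian", "ako", "ng", "mansanas"]
print(pseudo_perplexity(valid, toy_cond_prob))
print(pseudo_perplexity(misspelled, toy_cond_prob))
```

Under this definition a lower score marks a sentence the model finds more plausible, so candidate corrections could be ranked by score without any annotated training data.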