GMW-Greek Misspelled Words
Abstract: This dataset contains 574,883 distinct Greek words, each paired with several misspellings (4.32 misspellings per word on average). In total, it contains more than 3 million Greek word forms (3,063,143). The misspellings were generated algorithmically. The dataset can be used to evaluate methods for approximate matching, error correction, and similar tasks.
External IDs: doi:10.5281/zenodo.14266910
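
As an illustration of the evaluation use case named in the abstract, the following minimal sketch scores a naive nearest-neighbour corrector by top-1 accuracy. It rests on several assumptions: the record does not describe the file layout, so the (correct word, misspelling) pairs are hard-coded; the three sample pairs are hand-made for illustration and are not taken from the dataset; and the helper names `levenshtein`, `nearest_word`, and `top1_accuracy` are hypothetical. Levenshtein distance stands in here for whichever matching method is under evaluation.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def nearest_word(misspelling: str, lexicon: List[str]) -> str:
    """Return the lexicon entry closest to the misspelling (ties broken alphabetically)."""
    return min(lexicon, key=lambda w: (levenshtein(misspelling, w), w))


def top1_accuracy(pairs: List[Tuple[str, str]], lexicon: List[str]) -> float:
    """Fraction of misspellings whose nearest lexicon entry is the intended word."""
    hits = sum(1 for word, miss in pairs if nearest_word(miss, lexicon) == word)
    return hits / len(pairs)


# Hand-made sample (assumption: NOT drawn from the dataset files).
lexicon = ["καλημέρα", "θάλασσα", "βιβλίο"]
pairs = [
    ("καλημέρα", "καλιμέρα"),  # vowel substitution
    ("θάλασσα", "θάλασα"),     # dropped double consonant
    ("βιβλίο", "βηβλίο"),      # vowel substitution
]

accuracy = top1_accuracy(pairs, lexicon)
print(f"top-1 accuracy: {accuracy:.2f}")
```

A linear scan over the lexicon is only practical for small samples; with several hundred thousand words, an indexed structure such as a BK-tree or an n-gram index would be a more realistic baseline. Comparing strings in a single Unicode normalization form (for example NFC) also matters for accented Greek characters.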