A Comparison of Character-Based Neural Machine Translations Techniques Applied to Spelling Normalization

2020 (modified: 29 Oct 2021), ICPR Workshops (7) 2020
Abstract: The lack of spelling conventions and the natural evolution of human language create a linguistic barrier inherent in historical documents. This barrier has always been a concern for scholars in the humanities. To tackle this problem, spelling normalization aims to adapt a document's orthography to modern standards. In this work, we evaluate several character-based neural machine translation approaches to normalization, using modern documents to enrich the neural models. We evaluate these approaches on several datasets from different languages and time periods, and conclude that each approach is better suited to a different set of documents.