Unsupervised Quality Estimation Without Reference Corpus for Subtitle Machine Translation Using Word Embeddings

Published: 01 Jan 2019, Last Modified: 31 Dec 2024. Venue: ICSC 2019. License: CC BY-SA 4.0
Abstract: We demonstrate the potential of aligned bilingual word embeddings for building an unsupervised method that evaluates machine translations without the need for a parallel translation corpus or a reference corpus. We explain why movie subtitles differ from other text and report experimental results on subtitles for four target languages (French, German, Portuguese and Spanish) with English source subtitles. We propose a novel automated evaluation method that counts edits (insertions, deletions, substitutions and shifts) to indicate translation quality and the human post-editing effort required to perfect the machine translation.
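As a rough illustration of the idea in the abstract, the following minimal sketch matches source and target words by cosine similarity in a shared embedding space and counts the edits that remain. It is not the paper's algorithm: the abstract does not specify the matching strategy, the similarity threshold, or how edits are tallied, so the greedy matching, the `threshold` value, the function names `cosine` and `estimate_edits`, and the three-dimensional toy vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def estimate_edits(src_tokens, tgt_tokens, src_emb, tgt_emb, threshold=0.8):
    """Greedy, hypothetical edit estimate between a source line and its translation.

    src_emb and tgt_emb are assumed to be word -> vector mappings that already
    live in one aligned (shared) space. No reference translation is used.
    """
    unmatched_tgt = set(range(len(tgt_tokens)))
    matched = []        # target index of each matched source token, in source order
    unmatched_src = 0   # source tokens with no sufficiently similar target token

    for s in src_tokens:
        best_j, best_sim = None, -1.0
        for j in sorted(unmatched_tgt):
            t = tgt_tokens[j]
            if s in src_emb and t in tgt_emb:
                sim = cosine(src_emb[s], tgt_emb[t])
                if sim > best_sim:
                    best_j, best_sim = j, sim
        if best_j is not None and best_sim >= threshold:
            unmatched_tgt.remove(best_j)
            matched.append(best_j)
        else:
            unmatched_src += 1

    # Pair leftover source and target tokens as substitutions; whatever remains
    # is content to insert into, or delete from, the translation.
    substitutions = min(unmatched_src, len(unmatched_tgt))
    insertions = unmatched_src - substitutions
    deletions = len(unmatched_tgt) - substitutions
    # A shift is counted whenever consecutive matches appear out of order.
    shifts = sum(1 for a, b in zip(matched, matched[1:]) if b < a)
    return {
        "insertion": insertions,
        "deletion": deletions,
        "substitution": substitutions,
        "shift": shifts,
    }


# Toy, hand-made vectors standing in for aligned English/French embeddings.
src_emb = {"the": (1.0, 0.0, 0.0), "red": (0.0, 1.0, 0.0), "car": (0.0, 0.0, 1.0)}
tgt_emb = {"la": (0.9, 0.1, 0.0), "rouge": (0.0, 1.0, 0.1), "voiture": (0.1, 0.0, 0.95)}

edits = estimate_edits(["the", "red", "car"], ["la", "voiture", "rouge"], src_emb, tgt_emb)
print(edits)
```

In this toy case every English word finds a close French counterpart, and only the adjective-noun reordering registers as an edit. A real system would load pretrained aligned embeddings and would likely need a more careful alignment than this greedy pass.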