Bridging the Inflection Morphology Gap for Arabic Statistical Machine Translation

Andreas Zollmann, Ashish Venugopal, Stephan Vogel

HLT-NAACL 2006 (modified: 16 Jul 2019)

Abstract: Statistical machine translation (SMT) is based on the ability to effectively learn word and phrase relationships from parallel corpora, a process that is considerably more difficult when the extent of morphological expression differs significantly between the source and target languages. We present techniques that select appropriate word segmentations in the morphologically rich source language based on contextual relationships in the target language. Our approach takes advantage of existing word-level morphological analysis components to improve translation quality above the state of the art on a limited-data Arabic-to-English speech translation task.
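
The idea of choosing a source-side segmentation from target-side context can be illustrated with a toy sketch. This is not the authors' procedure, which the abstract does not spell out. It assumes a hypothetical lexicon of p(target word | source morpheme) values, invented here for illustration, and a set of candidate segmentations such as a word-level morphological analyzer might propose. The names `LEXICON`, `FLOOR`, `segmentation_score`, and `select_segmentation` are all assumptions, as are the transliterated toy forms.

```python
from math import log

# Hypothetical lexicon p(target_word | source_morpheme).
# All probabilities are invented purely for illustration.
LEXICON = {
    ("w", "and"): 0.8,
    ("ktb", "wrote"): 0.6,
    ("wktb", "wrote"): 0.2,
}
FLOOR = 1e-6  # assumed smoothing value for unseen (morpheme, word) pairs


def morpheme_score(morpheme, target_words):
    """Log-probability of the best-matching target word for one morpheme."""
    return max(log(LEXICON.get((morpheme, t), FLOOR)) for t in target_words)


def segmentation_score(segmentation, target_words):
    """Average per-morpheme score, so splitting is not penalised for length alone."""
    total = sum(morpheme_score(m, target_words) for m in segmentation)
    return total / len(segmentation)


def select_segmentation(candidates, target_words):
    """Pick the candidate whose morphemes are best supported by the target sentence."""
    return max(candidates, key=lambda seg: segmentation_score(seg, target_words))


# Two candidate analyses of one toy source word: unsegmented vs. prefix + stem.
candidates = [("wktb",), ("w", "ktb")]

# The target sentence contains "and", so splitting off the prefix is supported.
with_conjunction = select_segmentation(candidates, ["and", "he", "wrote"])

# The target sentence has no counterpart for the prefix, so the word stays whole.
without_conjunction = select_segmentation(candidates, ["he", "wrote"])
```

In this sketch the same source word receives different segmentations depending on the target sentence it is paired with, which is the kind of context-dependent decision the abstract describes. A real system would estimate the lexicon from the parallel corpus instead of hard-coding it.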
