CycleGN: a Cycle Consistent approach for Neural Machine Translation training using the Transformer model in a shuffled dataset

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission
TL;DR: Two NMT models can be trained simultaneously on non-parallel datasets using a cycle-consistency approach.
Abstract: CycleGN is a Transformer architecture that adopts a discriminator-less CycleGAN approach, tailored specifically for training Machine Translation models on non-parallel datasets. Although large parallel corpora are widely available for many language pairs, the ability to train on purely monolingual data would substantially expand the pool of usable training material, which is particularly valuable for languages with scarce parallel text. The foundational idea of our work is that, in an ideal scenario, the translation of a translation should revert to the original source sentence. Consequently, a pair of models can be trained simultaneously under a cycle-consistency loss. This method resembles back-translation, a technique widely used in Machine Translation, in which a pre-trained translation model generates new examples from a monolingual corpus, artificially creating a parallel dataset for further training and refinement.
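The abstract's core idea lends itself to a short sketch: two translation models are trained jointly so that each reconstructs the sentences that the other has translated. The sketch below is a minimal, assumed implementation in PyTorch, not the paper's actual code; it assumes the intermediate translation is produced by greedy decoding and detached (as in online back-translation), so each model receives gradients only from the reconstruction pass of the cycle. All module names, hyperparameters, and the toy batches are illustrative assumptions.

```python
# Minimal sketch of discriminator-less cycle-consistency training for two NMT
# models on non-parallel data. Assumed/illustrative; not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, D_MODEL, MAX_LEN, BOS, PAD = 1000, 128, 20, 1, 0

class TinyTranslator(nn.Module):
    """A small encoder-decoder Transformer mapping one language to another."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=4, num_encoder_layers=2,
            num_decoder_layers=2, dim_feedforward=256, batch_first=True)
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt_in):
        # Teacher-forced decoding: predict target tokens given src and the
        # shifted target prefix.
        mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt_in), tgt_mask=mask)
        return self.out(h)

    @torch.no_grad()
    def greedy_translate(self, src):
        # Greedy decoding; the result is detached, so gradients flow only
        # through the reconstruction pass, not through generation.
        ys = torch.full((src.size(0), 1), BOS, dtype=torch.long)
        for _ in range(MAX_LEN - 1):
            logits = self.forward(src, ys)
            ys = torch.cat([ys, logits[:, -1:].argmax(-1)], dim=1)
        return ys

def cycle_loss(model_fwd, model_bwd, batch):
    """Translate with model_fwd, then ask model_bwd to reconstruct the input."""
    pseudo = model_fwd.greedy_translate(batch)         # x -> y_hat (detached)
    logits = model_bwd(pseudo, batch[:, :-1])          # y_hat -> reconstruct x
    return F.cross_entropy(logits.reshape(-1, VOCAB),
                           batch[:, 1:].reshape(-1), ignore_index=PAD)

# Two models trained jointly: f translates L1 -> L2, g translates L2 -> L1.
f_model, g_model = TinyTranslator(), TinyTranslator()
opt = torch.optim.Adam(
    list(f_model.parameters()) + list(g_model.parameters()), lr=3e-4)

# Toy monolingual batches standing in for the two non-parallel corpora.
l1_batch = torch.randint(3, VOCAB, (4, 12)); l1_batch[:, 0] = BOS
l2_batch = torch.randint(3, VOCAB, (4, 12)); l2_batch[:, 0] = BOS

loss = (cycle_loss(f_model, g_model, l1_batch)
        + cycle_loss(g_model, f_model, l2_batch))
opt.zero_grad(); loss.backward(); opt.step()
print(f"cycle-consistency loss: {loss.item():.3f}")
```

Under this assumed setup, each model improves from reconstructing the other model's pseudo-translations, which is why the approach is described as resembling back-translation performed online during joint training.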
Paper Type: long
Research Area: Machine Translation
Contribution Types: NLP engineering experiment
Languages Studied: English, German