Abstract: Dual learning has been successfully applied in many machine learning applications, including machine translation, image-to-image transformation, etc. The high-level idea of dual learning is very intuitive: if we map an x from one domain to another and then map it back, we should recover the original x. Although its effectiveness has been empirically verified, theoretical understanding of dual learning is still missing. In this paper, we conduct a theoretical study to understand why and when dual learning can improve a mapping function. Based on the theoretical discoveries, we extend dual learning by introducing more related mappings and propose highly symmetric frameworks, cycle dual learning and multipath dual learning, in both of which we can leverage the feedback signals from additional domains to improve the qualities of the mappings. We prove that both cycle dual learning and multipath dual learning can boost the performance of standard dual learning under mild conditions. Experiments on WMT 14 English↔German and MultiUN English↔French translations verify our theoretical findings on dual learning, and the results on the translations among English, French, and Spanish of MultiUN demonstrate the efficacy of cycle dual learning and multipath dual learning.
Keywords: machine translation, dual learning
5 Replies
Loading