Writing in two languages: Neural machine translation as an assistive bilingual writing tool. (Ecrire en deux langues: la traduction automatique neuronale au service d'aide à la rédaction bilingue)

Abstract: In an increasingly globalized world, people more and more often need to express themselves in a foreign language or in several languages. For many, however, writing in a foreign language is not an easy task. Machine translation tools can help generate texts in multiple languages, and with the tangible progress of neural machine translation (NMT), translation technologies now deliver usable translations in a growing number of contexts. Still, it is not yet realistic to expect NMT systems to produce error-free translations. Users with a good command of a given foreign language may therefore benefit from computer-aided translation technologies. When they run into difficulties, users writing in a foreign language can turn to external resources such as dictionaries, terminologies, or bilingual concordancers, but consulting these resources interrupts the writing process and starts another cognitive activity. To make the process smoother, writing assistant systems can be extended to support bilingual text composition. Existing studies have mainly focused on generating texts in a foreign language. We suggest that also showing the corresponding text in the user's mother tongue, in the form of synchronized bitexts, can help users verify what they have composed. In this thesis, we study techniques for building bilingual writing assistant systems that allow free composition in both languages and display synchronized monolingual texts in the two languages. We introduce two types of simulated interactive systems. The first allows users to compose mixed-language texts, which are then translated into their monolingual counterparts. For this, we propose a dual decoder Transformer model comprising a shared encoder and two decoders that simultaneously produce texts in the two languages.
We also explore the dual decoder model for various other tasks, such as multi-target translation, bidirectional translation, generating translation variants, and multilingual subtitling. The second design aims to extend commercial online translation systems by letting users freely alternate between the two languages, switching between the two text input boxes at will. In this scenario, the technical challenge is to keep the two input texts synchronized while taking the users' inputs into account, again with the goal of authoring two equally good versions of the text. For this, we introduce a general bilingual synchronization task, and we implement and experiment with autoregressive and non-autoregressive synchronization systems. We also investigate bilingual synchronization models on specific downstream tasks, such as parallel corpus cleaning and NMT with translation memories, to study the generalization ability of the proposed models.