End-to-end Speech Translation with Spoken-to-Written Style Conversion

Anonymous

16 Jan 2022 (modified: 05 May 2023) | ACL ARR 2022 January Blind Submission | Readers: Everyone
Abstract: End-to-end speech translation (ST), which translates speech in a source language directly into text in a target language with a single model, has attracted a great deal of attention in recent years. Compared with cascaded ST, it offers easier deployment, better efficiency, and less error propagation. Meanwhile, spoken-to-written style conversion has been shown to improve cascaded ST by reducing the gap between the language style of speech transcriptions and that of the bilingual corpora used for machine translation training. It is therefore desirable to integrate this conversion into end-to-end ST. In this paper, we propose a joint task of speech-to-written-style-text conversion and end-to-end ST, together with an interactive-attention-based multi-decoder model for the joint task, to improve end-to-end ST. Experiments on a Japanese-English lecture ST dataset and CoVoST 2 Native Japanese show that our models outperform a strong baseline on Japanese-English ST.
Paper Type: short