Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline

Published: 01 Jan 2024, Last Modified: 19 Feb 2025. Venue: SPECOM (1) 2024. License: CC BY-SA 4.0
Abstract: Cross-lingual speech-to-speech translation, which converts speech in one language into speech in another, plays a pivotal role in overcoming language barriers and promoting cross-cultural communication. The proliferation of multimedia content makes it difficult for audiences to efficiently consume extended audio such as news broadcasts, academic lectures, and political speeches. To address this, we investigate summarization in the context of cross-lingual speech-to-speech translation (S2S-Summ), with a focus on low-resource Indic languages. To the best of our knowledge, this task has not been explored in prior research. We develop and present a semi-synthetic dataset of translated summaries in Hindi (Hi), Bengali (Bn), Gujarati (Gu), and Tamil (Ta), alongside baseline models for this task. We evaluate our models using metrics such as BERTScore, ROUGE, and UniEval. Our study aims to catalyze further exploration in this area, facilitating streamlined access to multilingual audio content and enhancing information dissemination across linguistic boundaries. Code and data are available at https://github.com/pranavkarande/S2S-Summ.