LLMs Are Zero-Shot Context-Aware Simultaneous Translators

Published: 31 Aug 2024 · Last Modified: 06 Feb 2025
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
License: CC BY 4.0
Abstract: The advent of transformers has fueled progress in machine translation. More recently, large language models (LLMs) have come into the spotlight thanks to their generality and strong performance across a wide range of language tasks, including translation. Here we show that open-source LLMs perform on par with or better than some state-of-the-art baselines on simultaneous machine translation (SiMT) tasks, zero-shot. We also demonstrate that injecting minimal background information, which is easy to do with an LLM, brings further performance gains, especially on challenging technical subject matter. This highlights LLMs' potential for building the next generation of massively multilingual, context-aware, and terminologically accurate SiMT systems that require no resource-intensive training or fine-tuning.
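To make the prompting idea concrete, below is a minimal sketch, not the authors' implementation, of what a zero-shot, context-aware SiMT loop could look like. The function llm_generate is a hypothetical placeholder for any chat-style LLM call (e.g., a locally served open-source model); the prompt wording, the German-to-English direction, and the chunk-by-chunk read policy are all illustrative assumptions rather than details taken from the paper.

```python
# A minimal sketch (not the authors' code) of zero-shot, context-aware
# simultaneous translation via incremental prompting. llm_generate is a
# hypothetical placeholder; prompt wording, language pair, and chunking
# policy are illustrative assumptions.

from typing import Iterable, Iterator


def llm_generate(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its completion."""
    raise NotImplementedError("wire up a model call of your choice here")


def simultaneous_translate(
    source_chunks: Iterable[str], background: str = ""
) -> Iterator[str]:
    """Translate a growing source prefix chunk by chunk, emitting only the
    newly produced target words at each step (a simple incremental policy)."""
    prefix, target = "", ""
    for chunk in source_chunks:
        prefix += chunk  # READ: extend the observed source prefix
        prompt = (
            f"Background information: {background}\n"
            f"Partial German source: {prefix}\n"
            f"English translation so far: {target}\n"
            "Continue the English translation of the source prefix. "
            "Output only the new words; do not revise earlier output."
        )
        new_text = llm_generate(prompt).strip()
        if new_text:
            target = f"{target} {new_text}".strip()
            yield new_text  # WRITE: commit output before the source ends
```

The design point this sketch illustrates is that the model sees the full source prefix plus any injected background context on every call, but is asked to emit only new target words, so output is committed incrementally without revising earlier translation.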