Added Toxicity Mitigation at Inference Time for Multimodal and Massively Multilingual Translation

Published: 01 Jan 2024, Last Modified: 07 Oct 2024 · EAMT (1) 2024 · CC BY-SA 4.0
Abstract: Machine translation models sometimes lead to added toxicity: translated outputs may contain more toxic content than the original input. In this paper, we introduce MinTox, a novel pipeline to automatically identify and mitigate added toxicity at inference time, without further model training. MinTox leverages a multimodal (speech and text) toxicity classifier that can scale across languages. We demonstrate the capabilities of MinTox when applied to SEAMLESSM4T, a multimodal and massively multilingual machine translation system. MinTox significantly reduces added toxicity: across all domains, modalities and language directions, 25% to 95% of added toxicity is successfully filtered out, while preserving translation quality.
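As a rough illustration of the inference-time idea described in the abstract, the sketch below compares toxicity detections on the source and the translation and only triggers mitigation when toxicity is *added* by the model. The helpers `translate` and `detect_toxic_words` are hypothetical placeholders standing in for a translation system and a multilingual toxicity classifier; this is a minimal sketch of the general approach, not the actual MinTox or SEAMLESSM4T implementation.

```python
# Minimal sketch of inference-time added-toxicity mitigation.
# All callables below (translate, detect_toxic_words) are hypothetical
# placeholders, not the actual MinTox / SEAMLESSM4T API.

from typing import Callable, Set


def added_toxicity(source_toxic: Set[str], target_toxic: Set[str]) -> Set[str]:
    """Toxic items present in the translation but absent from the source."""
    return target_toxic - source_toxic


def mitigate(
    source: str,
    translate: Callable[[str, Set[str]], str],
    detect_toxic_words: Callable[[str], Set[str]],
) -> str:
    """Translate once; if the hypothesis introduces toxicity not found in
    the source, re-translate while discouraging the offending items
    (e.g. via banned-word constrained decoding)."""
    hypothesis = translate(source, set())          # unconstrained first pass
    added = added_toxicity(
        detect_toxic_words(source),
        detect_toxic_words(hypothesis),
    )
    if not added:
        return hypothesis                          # no added toxicity: keep output
    return translate(source, added)                # constrained re-decode
```

The key design point reflected here is that mitigation happens purely at inference time: the model weights are untouched, and only hypotheses flagged for added toxicity incur the extra constrained decoding pass.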