Obscuring Data Contamination Through Translation: Evidence from Arabic Corpora

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Data contamination, translation-induced leakage, multilingual evaluation reliability
TL;DR: Translating English benchmarks into another language can mask, but not remove, training-data leakage; simple, label-preserving perturbations expose the hidden contamination that inflates reported accuracy.
Abstract: Data contamination threatens the validity of Large Language Model (LLM) evaluation by allowing models to exploit memorized benchmark content rather than demonstrating true generalization. While existing detection methods focus primarily on English datasets, little is known about how contamination manifests in multilingual contexts. In this paper, we investigate contamination dynamics by fine-tuning several open-weight LLMs on varying proportions of different Arabic datasets and evaluating them on the original English benchmarks. To probe memorization, we extend the Testset Slot (TS)-Guessing method with a choice-reordering strategy, enabling us to disentangle genuine reasoning from contaminated recall. Our results show that while translation into Arabic conceals traditional contamination signals, models still benefit from exposure to contaminated data, particularly those with stronger Arabic capabilities. This demonstrates that translation can mask but not eliminate contamination, creating a dangerous blind spot in current evaluation practices. To address this, we propose a Translation-Aware Contamination Detection framework, which checks for contamination across multiple translated versions of a benchmark rather than relying on English alone. Together, our findings highlight the need for multilingual contamination-aware pipelines to ensure fair, transparent, and reproducible evaluation of LLMs.
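A minimal sketch of the choice-reordering idea described in the abstract, under stated assumptions: all names here, including the `query_model` callable, are hypothetical illustrations rather than the authors' released code. The probe permutes the answer options of a multiple-choice item while keeping the gold label attached to its text, then compares accuracy on original versus reordered items; a large drop suggests the model memorized option positions or surface forms rather than reasoning over the question.

```python
import random

def reorder_choices(question, choices, gold_idx, seed=0):
    """Label-preserving perturbation: permute the answer options
    while tracking where the gold answer text ends up."""
    rng = random.Random(seed)
    order = list(range(len(choices)))
    rng.shuffle(order)
    new_choices = [choices[i] for i in order]
    new_gold_idx = order.index(gold_idx)  # gold text keeps its label
    return question, new_choices, new_gold_idx

def contamination_gap(items, query_model):
    """Compare accuracy on original vs. choice-reordered items.
    `query_model(question, choices) -> predicted option index` is a
    stand-in for whatever inference wrapper is available."""
    orig_correct = perturbed_correct = 0
    for question, choices, gold_idx in items:
        if query_model(question, choices) == gold_idx:
            orig_correct += 1
        q2, c2, g2 = reorder_choices(question, choices, gold_idx)
        if query_model(q2, c2) == g2:
            perturbed_correct += 1
    n = len(items)
    # A large positive gap hints that accuracy depended on the
    # memorized ordering of options, i.e. contaminated recall.
    return orig_correct / n - perturbed_correct / n
```

Because the perturbation preserves the label, a genuinely reasoning model should score roughly the same on both versions, so the gap isolates the memorization signal.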
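Likewise, a schematic of the Translation-Aware Contamination Detection framework as the abstract describes it, not a definitive implementation: the per-item `detect_contamination` scorer (e.g., a TS-Guessing-style fill-in probability) and the threshold are assumed placeholders. The same detector is applied to every translated version of an item, and the item is flagged if any language shows a contamination signal, so a leak hidden by translation into Arabic can still be caught via English, and vice versa.

```python
def translation_aware_flag(item_by_lang, detect_contamination, threshold=0.5):
    """item_by_lang: dict mapping language code -> benchmark item text.
    detect_contamination: any per-item detector returning a score in [0, 1].
    Flags the item if the contamination signal exceeds the threshold
    in ANY translated version, rather than in English alone."""
    scores = {lang: detect_contamination(text)
              for lang, text in item_by_lang.items()}
    flagged = max(scores.values()) >= threshold
    return flagged, scores
```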
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 24550