MIND Your Language: A Multilingual Dataset for Cross-Lingual News Recommendation (Extended Abstract)

Published: 2024, Last Modified: 04 Mar 2026, Venue: KI 2024, License: CC BY-SA 4.0
Abstract: We present xMIND, an open, multilingual news recommendation dataset derived from the English MIND dataset using machine translation, covering 14 linguistically and geographically diverse languages, with digital footprints of varying sizes. Using xMIND, we benchmark several content-based neural news recommenders (NNRs) in zero-shot (ZS-XLT) and few-shot (FS-XLT) cross-lingual transfer scenarios, considering both monolingual and bilingual news consumption patterns. Our findings reveal that (i) current NNRs, even when based on a multilingual language model, suffer from substantial performance losses under ZS-XLT and that (ii) inclusion of target-language data in FS-XLT training has limited benefits, particularly when combined with bilingual news consumption. We release xMIND at https://github.com/andreeaiana/xMIND.