MIND Your Language: A Multilingual Dataset for Cross-Lingual News Recommendation (Extended Abstract)
Abstract: We present xMIND, an open, multilingual news recommendation dataset derived from the English MIND dataset using machine translation. It covers 14 linguistically and geographically diverse languages with digital footprints of varying sizes. Using xMIND, we benchmark several content-based neural news recommenders (NNRs) in zero-shot (ZS-XLT) and few-shot (FS-XLT) cross-lingual transfer scenarios, considering both monolingual and bilingual news consumption patterns. Our findings reveal that (i) current NNRs, even when based on a multilingual language model, suffer substantial performance losses under ZS-XLT, and that (ii) including target-language data in FS-XLT training has limited benefits, particularly when combined with bilingual news consumption. We release xMIND at https://github.com/andreeaiana/xMIND.
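To make the four evaluation settings concrete, the following is a minimal sketch of how news could be assigned a language under ZS-XLT versus FS-XLT training and under monolingual versus bilingual consumption. It is an illustration under stated assumptions, not the authors' protocol: the helper `assign_languages`, the labels `"src"` and `"tgt"`, the article ids, and the shares 0.3 and 0.5 are all hypothetical.

```python
import random


def assign_languages(news_ids, target_share, rng):
    """Map each news id to "src" (English) or "tgt" (target language).

    Hypothetical helper: target_share is the fraction of articles that
    are shown in the target language instead of English.
    """
    ids = sorted(news_ids)
    k = round(target_share * len(ids))
    in_target = set(rng.sample(ids, k))
    return {n: ("tgt" if n in in_target else "src") for n in ids}


rng = random.Random(0)
train_news = [f"N{i}" for i in range(10)]  # toy article ids (assumed)
test_news = [f"T{i}" for i in range(10)]

# Training side: ZS-XLT sees no target-language news,
# FS-XLT sees a small share of it (0.3 is an assumed value).
zs_train = assign_languages(train_news, 0.0, rng)
fs_train = assign_languages(train_news, 0.3, rng)

# Evaluation side: monolingual users read only target-language news,
# bilingual users read a mix (0.5 is an assumed value).
mono_test = assign_languages(test_news, 1.0, rng)
bi_test = assign_languages(test_news, 0.5, rng)

for name, split in [("ZS-XLT train", zs_train), ("FS-XLT train", fs_train),
                    ("monolingual test", mono_test), ("bilingual test", bi_test)]:
    n_tgt = sum(lang == "tgt" for lang in split.values())
    print(f"{name}: {n_tgt}/{len(split)} articles in the target language")
```

In practice, the language assignment would select which translated version of an article's text is fed to the recommender; the sketch only tracks the assignment itself.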
External IDs: dblp:conf/ki/IanaGP24