BABELEDITS: A Benchmark and a Modular Approach for Robust Cross-lingual Knowledge Editing of Large Language Models
Abstract: With Large Language Models (LLMs) becoming increasingly multilingual, effective knowledge editing (KE) needs to propagate edits across languages. Evaluation of existing methods for cross-lingual knowledge editing (CKE) is limited in two respects. With respect to edit effectiveness, benchmarks do not account for entity aliases and use faulty entity translations. With respect to robustness, existing work fails to report on LLMs' downstream generation and task-solving abilities after editing.
In this work, we aim to (i) maximize the effectiveness of CKE while at the same time (ii) minimizing the extent of downstream model collapse due to the edits. To accurately measure the effectiveness of CKE methods, we introduce BabelEdits, a new CKE dataset covering 60 languages that combines high-quality multilingual synsets from BabelNet with marker-based translation to ensure entity translation quality. Unlike existing CKE benchmarks, BabelEdits accounts for the rich variety of entity aliases within and across languages.
We then propose BabelReFT, a modular CKE approach based on representation fine-tuning (ReFT), which learns entity-scope ReFT modules and applies them to all multilingual aliases at inference. Our experimental results show that BabelReFT is not only more effective in CKE than state-of-the-art methods but, owing to its modular design, also much more robust against downstream model collapse when subjected to many sequential edits.
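The idea of entity-scope modules triggered by multilingual aliases can be illustrated with a minimal sketch. It is not the authors' implementation: the class `AliasRouter`, the helper `make_offset_module`, the entity identifier `entity:berlin`, and the substring-matching rule are all hypothetical, and a simple additive offset stands in for a learned ReFT intervention.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Intervention = Callable[[Vector], Vector]


class AliasRouter:
    """Hypothetical registry mapping alias strings to per-entity modules."""

    def __init__(self) -> None:
        self._alias_to_entity: Dict[str, str] = {}
        self._modules: Dict[str, Intervention] = {}

    def register_edit(self, entity_id: str, aliases: List[str],
                      module: Intervention) -> None:
        # One module per edited entity, shared by all of its aliases.
        self._modules[entity_id] = module
        for alias in aliases:
            self._alias_to_entity[alias.casefold()] = entity_id

    def route(self, prompt: str) -> Optional[str]:
        # Assumption: plain case-insensitive substring matching,
        # trying longer aliases first.
        text = prompt.casefold()
        for alias in sorted(self._alias_to_entity, key=len, reverse=True):
            if alias in text:
                return self._alias_to_entity[alias]
        return None

    def apply(self, prompt: str, hidden: Vector) -> Vector:
        # Prompts that mention no edited entity leave the hidden state as-is.
        entity_id = self.route(prompt)
        if entity_id is None:
            return hidden
        return self._modules[entity_id](hidden)


def make_offset_module(offset: Vector) -> Intervention:
    """Placeholder intervention: adds a fixed offset to the hidden vector."""
    def module(hidden: Vector) -> Vector:
        return [h + o for h, o in zip(hidden, offset)]
    return module


router = AliasRouter()
router.register_edit(
    "entity:berlin",  # hypothetical identifier
    ["Berlin", "Berlino", "Берлин", "ベルリン"],
    make_offset_module([1.0, -1.0]),
)
```

Because each module is only invoked when one of its aliases appears in the prompt, unrelated prompts pass through unchanged in this sketch, which is one way to picture why a modular design could limit interference between many sequential edits.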
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: cross-lingual transfer, robustness, knowledge-augmented methods, multilingual benchmarks, multilingual evaluation, resources for less-resourced languages
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: Azerbaijani, Belarusian, Bulgarian, Bengali, Catalan, Czech, Danish, German, Greek, English, Spanish, Estonian, Basque, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Haitian, Hungarian, Armenian, Indonesian, Italian, Japanese, Javanese, Georgian, Kazakh, Korean, Lithuanian, Malayalam, Marathi, Malay, Burmese, Dutch, Norwegian, Punjabi, Polish, Portuguese, Quechua, Romanian, Russian, Slovak, Swedish, Serbian, Swahili, Tamil, Telugu, Thai, Tagalog, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Yoruba, Chinese
Submission Number: 6704