Abstract: Knowledge editing (KE) methods offer an efficient way to modify knowledge in large language models. Current KE evaluations typically assess editing success by considering only the edited knowledge, without any prefix context. In real-world applications, however, prefix contexts may trigger retrieval of the original knowledge and undermine the intended edit. To address this issue, we develop CHED, a benchmark designed to evaluate the context robustness of KE methods. Evaluations on CHED show that while current methods effectively edit knowledge when no context is present, they often fail when prefix contexts are added. To mitigate this shortcoming, we introduce CoRE, a KE method designed to strengthen context robustness by minimizing context-induced variance in the model's hidden states for edited knowledge. CoRE not only improves the editing success rate when a prefix context is present but also preserves the model's overall capabilities. The source code and data will be released publicly upon publication of the paper.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: corpus creation, NLP datasets, benchmarking, evaluation methodologies
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English
Submission Number: 1665