Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Large language models (LLMs) have achieved remarkable performance on various natural language tasks. However, they are trained on static corpora, and their knowledge can quickly become outdated in a fast-changing world. This motivates knowledge editing (KE), which updates specific knowledge in LLMs without altering unrelated knowledge or compromising their pre-trained capabilities. Previous efforts update a small number of parameters of an LLM and have proved effective for making selective updates. Nonetheless, the edited LLM often exhibits a degraded ability to reason about the new knowledge. In this work, we identify a key issue: heterogeneous token overfitting (HTO), where the LLM overfits different tokens in the provided knowledge at varying rates. To tackle this, we propose OVERTONE, a token-level smoothing method that mitigates HTO by adaptively refining the target distribution. Theoretically, OVERTONE offers better parameter updates with negligible computation overhead. It also induces an implicit DPO but does not require preference data pairs. Extensive experiments across four editing methods, two LLMs, and diverse scenarios demonstrate the effectiveness and versatility of our method.
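The abstract does not spell out the exact OVERTONE formulation. As a rough illustration only, the sketch below shows one plausible form of token-level adaptive smoothing for a knowledge-editing loss: each target token's one-hot label is mixed with the model's own (detached) predictive distribution, with a per-token weight so tokens the model already fits well receive more smoothing while poorly fit tokens stay close to the hard label. All function and argument names are hypothetical and this is not the paper's official implementation.

```python
# Hypothetical sketch of token-level adaptive label smoothing for knowledge editing.
# Assumptions (not from the paper): the mixing weight is driven by the probability
# the model already assigns to each target token.
import torch
import torch.nn.functional as F

def token_level_smoothed_loss(logits, target_ids, base_alpha=0.1):
    """logits: (seq_len, vocab); target_ids: (seq_len,) token ids of the edited fact."""
    log_probs = F.log_softmax(logits, dim=-1)        # model log-distribution per token
    probs = log_probs.exp().detach()                 # reference distribution, no gradient

    # Probability the model already assigns to each target token.
    p_target = probs.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)  # (seq_len,)

    # Adaptive mixing weight: more smoothing for tokens that are already well fit,
    # so they stop accumulating gradient while under-fit tokens keep learning.
    alpha = base_alpha + (1.0 - base_alpha) * p_target                 # in [base_alpha, 1]

    # Refined target distribution: convex mix of the one-hot label and the model's own distribution.
    one_hot = F.one_hot(target_ids, num_classes=logits.size(-1)).float()
    refined = (1.0 - alpha).unsqueeze(-1) * one_hot + alpha.unsqueeze(-1) * probs

    # Cross-entropy against the refined soft targets, averaged over tokens.
    return -(refined * log_probs).sum(dim=-1).mean()
```

Under this sketch, a token the model already predicts with high confidence contributes almost no gradient, which is the intuition behind mitigating heterogeneous overfitting across tokens.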
Lay Summary: Powerful AI language tools (LLMs) are like highly knowledgeable students, but their information can quickly become outdated in our fast-changing world. Our research focuses on "knowledge editing" – the challenge of efficiently updating these AIs with new facts, without confusing them or causing them to forget what they already know. Previous attempts to edit an AI's knowledge often resulted in the AI memorizing new information too rigidly. This is a problem we call "heterogeneous token overfitting" (HTO), where the AI overfits some words in the new fact faster than others, hindering its ability to truly understand and reason with that information. We develop a new technique called "OVERTONE." It acts as a smarter teaching method, helping the AI learn new facts more flexibly and smoothly. OVERTONE ensures the AI doesn't just memorize words but can connect new knowledge with its existing understanding. Importantly, our method is efficient and significantly improves the AI's ability to use new information effectively, making these powerful tools more reliable and up-to-date.
Primary Area: Deep Learning->Large Language Models
Keywords: Knowledge Editing, Large Language Models
Submission Number: 11933