Abstract: Existing "locate-then-edit" approaches, which identify and perturb key parameters, often struggle in sequential editing scenarios, leading to overfitting, catastrophic forgetting, or model collapse. This paper introduces the Precise Neuron-Level Knowledge Editing (PNKE) framework, designed for efficient, low-interference knowledge updates via fine-grained neuron-level interventions. PNKE employs causal attribution to pinpoint the background and trigger neurons tied to target knowledge, then applies an entropy-guided sparse masking mechanism to select a critical neuron subset for targeted parameter updates. This design preserves editing precision while dynamically adjusting sparsity to maintain model stability during lifelong editing. In extensive lifelong editing experiments, PNKE outperforms state-of-the-art methods, achieving an editing success rate (Rel.) of 0.936, generalization (Gen.) of 0.891, and locality (Loc.) of 0.952 on benchmarks such as ZsRE and CounterFact. After 5,000 edits, PNKE sustains robust performance on downstream tasks such as MMLU and GSM8K, underscoring its stability and practical utility for continuous knowledge integration in LLMs.
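To make the entropy-guided sparse masking step concrete, here is a minimal sketch of one plausible reading of it: neurons are ranked by causal attribution scores, and the fraction kept scales with the normalized entropy of the attribution distribution, so diffusely attributed knowledge retains more neurons than sharply localized knowledge. This is an illustration, not the authors' implementation; the function name `entropy_guided_mask`, the ratio bounds, and the linear entropy-to-sparsity mapping are all assumptions.

```python
import torch

def entropy_guided_mask(attr_scores: torch.Tensor,
                        min_ratio: float = 0.01,
                        max_ratio: float = 0.10) -> torch.Tensor:
    """Select a sparse boolean mask over neurons from causal attribution
    scores (shape: num_neurons). Sparsity is adjusted dynamically: the
    higher the entropy of the attribution distribution, the more neurons
    are kept. The ratio bounds here are illustrative assumptions."""
    # Normalize attribution scores into a probability distribution.
    probs = torch.softmax(attr_scores, dim=0)
    entropy = -(probs * torch.log(probs + 1e-12)).sum()
    # Normalized entropy in [0, 1]: 1 = uniform attribution, 0 = one-hot.
    max_entropy = torch.log(torch.tensor(float(attr_scores.numel())))
    h = (entropy / max_entropy).item()
    # Map entropy linearly to a keep ratio, then take the top-k neurons.
    keep_ratio = min_ratio + h * (max_ratio - min_ratio)
    k = max(1, int(keep_ratio * attr_scores.numel()))
    mask = torch.zeros_like(attr_scores, dtype=torch.bool)
    mask[torch.topk(attr_scores, k).indices] = True
    return mask
```

Under this reading, a subsequent targeted update would be applied only to the parameters of masked neurons, which is one way the abstract's claim of low-interference sequential editing could be realized.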
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: feature attribution
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 194