MicroEdit: Neuron-level Knowledge Disentanglement and Localization in Lifelong Model Editing

ACL ARR 2025 May Submission 6505 Authors

20 May 2025 (modified: 04 Jul 2025). ACL ARR 2025 May Submission. License: CC BY 4.0

Abstract: Large language models (LLMs) require continual knowledge updates to keep pace with the evolving world. While various model editing methods have been proposed, most face critical challenges in lifelong learning contexts due to two fundamental limitations: (1) Edit Overshooting: parameter updates intended for a specific fact spill over to unrelated regions, interfering with previously retained knowledge; and (2) Knowledge Entanglement: polysemantic neurons encode multiple overlapping concepts, which makes it difficult to isolate and edit a single fact. In this paper, we propose MicroEdit, a neuron-level editing method that performs minimal and controlled interventions within LLMs. By leveraging a sparse autoencoder (SAE), MicroEdit disentangles knowledge representations and activates only a minimal set of necessary neurons for precise parameter updates. This targeted design enables fine-grained control over the editing scope, effectively mitigating interference and preserving unrelated knowledge. Extensive experiments show that MicroEdit outperforms prior methods and robustly handles lifelong knowledge editing across QA and hallucination settings on LLaMA and Mistral.
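
The general idea of restricting an update to a few sparsely activated units can be illustrated with a toy sketch. This is not the MicroEdit implementation, which the abstract does not specify. It assumes a hypothetical ReLU SAE encoder, a top-k selection rule, and an edited matrix with one row per SAE latent; all names (`sae_encode`, `top_k_active`, `masked_update`) and sizes are illustrative.

```python
import random


def sae_encode(h, W_enc, b_enc):
    """Toy SAE encoder (assumed form): one ReLU latent per row of W_enc."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, h)) + b)
        for row, b in zip(W_enc, b_enc)
    ]


def top_k_active(latents, k):
    """Indices of the k largest strictly positive latents (may return fewer)."""
    ranked = sorted(range(len(latents)), key=lambda i: latents[i], reverse=True)
    return [i for i in ranked[:k] if latents[i] > 0.0]


def masked_update(W, delta, active):
    """Copy W, adding delta only to the rows listed in `active`."""
    active = set(active)
    return [
        [w + d for w, d in zip(row, drow)] if i in active else list(row)
        for i, (row, drow) in enumerate(zip(W, delta))
    ]


# Hypothetical toy sizes and random values; nothing here comes from the paper.
rng = random.Random(0)
d_model, n_latents, k = 4, 8, 2
W_enc = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(n_latents)]
b_enc = [0.0] * n_latents
h = [rng.uniform(-1, 1) for _ in range(d_model)]  # stand-in hidden state

W_edit = [[0.0] * d_model for _ in range(n_latents)]  # parameters to edit
delta = [[0.1] * d_model for _ in range(n_latents)]   # dense candidate update

latents = sae_encode(h, W_enc, b_enc)
active = top_k_active(latents, k)
W_new = masked_update(W_edit, delta, active)

# Rows that actually changed; by construction these match `active`.
changed = [i for i, (old, new) in enumerate(zip(W_edit, W_new)) if old != new]
print("active latents:", active, "changed rows:", changed)
```

In this sketch every row outside the selected set is left identical to the original, which is the property the abstract attributes to limiting the editing scope.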

Paper Type: Long

Research Area: Interpretability and Analysis of Models for NLP

Research Area Keywords: model editing

Languages Studied: English

Submission Number: 6505
