Explainable and Efficient Editing for Large Language Models

Published: 29 Jan 2025 · Last Modified: 29 Jan 2025 · WWW 2025 Poster · CC BY 4.0
Track: Semantics and knowledge
Keywords: Large Language Models, Knowledge Editing, Model Explainability, Question Answering
TL;DR: Explainable and Efficient Sequential Model Editing for Large Language Models
Abstract: Large Language Models (LLMs) possess remarkable capabilities in storing and retrieving vast factual knowledge, but they often retain outdated or incorrect information from web corpora. While full retraining is costly, locate-and-edit model editing methods offer a feasible alternative. Current methods typically follow a two-stage paradigm: (1) identifying the critical layers for knowledge storage and (2) updating their parameters to store new knowledge. However, both stages have inherent limitations. In Stage 1, layer identification is independent of the knowledge to be updated, ignoring the varying storage patterns of different knowledge types. Stage 2 suffers from high computational overhead, since each piece of knowledge requires its own gradient descent. To address these limitations, we propose an Explainable and effiCient model Editing method, termed ECE. In Stage 1, ECE integrates LLM explainability into the editing process, enabling adaptive identification of crucial neurons based on the input knowledge. In Stage 2, ECE clusters similar knowledge based on the explanation results, allowing batch optimization in a single gradient step and significantly reducing time consumption without sacrificing effectiveness. Extensive experiments demonstrate that ECE achieves superior performance while delivering a 3.27× speedup in editing efficiency, showcasing the potential of explainability-driven editing methods for LLMs.
Submission Number: 317
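The two-stage pipeline the abstract describes can be illustrated with a minimal sketch. The toy code below is not the authors' ECE implementation: a small MLP stands in for an LLM's critical layer, |gradient × activation| is used as one plausible attribution heuristic for Stage 1 (the paper's actual explainability method is not specified here), and k-means is a stand-in clustering choice for Stage 2; all names and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the two-stage idea (NOT the authors' ECE implementation).
# Assumptions: a toy MLP stands in for an LLM's critical layer; gradient x
# activation is a generic attribution heuristic; k-means stands in for the
# paper's knowledge clustering.
import torch
from sklearn.cluster import KMeans

torch.manual_seed(0)

# Toy "critical layer": one FFN block in place of an LLM feed-forward layer.
layer = torch.nn.Sequential(
    torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 32)
)

def attribute_neurons(x, target, k=8):
    """Stage 1 (sketch): score hidden neurons by |gradient x activation|,
    so which neurons count as 'crucial' depends on the input knowledge."""
    h = torch.relu(layer[0](x))           # hidden activations
    h.retain_grad()                       # keep grads for a non-leaf tensor
    loss = torch.nn.functional.mse_loss(layer[2](h), target)
    loss.backward()
    scores = (h * h.grad).abs().mean(0)   # per-neuron attribution score
    return torch.topk(scores, k).indices

# Fake "knowledge": input/target pairs standing in for facts to be edited.
xs, ys = torch.randn(16, 32), torch.randn(16, 32)
print("crucial neurons:", attribute_neurons(xs[:4], ys[:4]).tolist())

# Stage 2 (sketch): cluster similar knowledge, then take ONE gradient step
# per cluster rather than one per fact -- the source of the claimed speedup.
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(xs.numpy())
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
for c in range(4):
    idx = torch.from_numpy((clusters == c).nonzero()[0])
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(layer(xs[idx]), ys[idx])
    loss.backward()
    opt.step()                            # single batched update per cluster
```

In the actual method, both the neuron attribution and the features used for clustering would come from the paper's explainability analysis; the sketch only shows the control flow that yields per-cluster batched updates instead of per-fact optimization.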