Research Area: Safety, Learning algorithms for LMs
Keywords: Large Language Model, Model Editing, Continual Learning
TL;DR: We propose a two-stage continual training paradigm for Model Editing that achieves state-of-the-art performance by sequentially adding lightweight expert networks and their corresponding indexing neurons.
Abstract: Addressing the issues of hallucinations and outdated knowledge in large language models is critical for their reliable application. Model Editing presents a promising avenue for mitigating these challenges in a cost-effective manner. However, existing methods often suffer from unsatisfactory generalization and unintended effects on non-edited samples. To overcome these limitations, we introduce a novel approach: Scalable Model Editing via Customized Expert Networks (SCEN), which is a two-stage continuous training paradigm. Specifically, in the first stage, we train lightweight expert networks individually for each piece of knowledge that needs to be updated. Subsequently, we train a corresponding indexing neuron for each expert to control the activation state of that expert. We conducted a series of experiments on the ZsRE and Hallucination benchmarks by tuning the advanced open-source LLM, Llama2, achieving state-of-the-art results compared to current mainstream methods. Our code is available at https://github.com/TAL-auroraX/SCEN.
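For intuition, the following is a minimal PyTorch sketch of the two-stage idea described in the abstract, not the released implementation: class names, the gating rule, and dimensions are assumptions. Stage one trains a lightweight expert per edited sample; stage two trains an indexing neuron per expert that decides when that expert is activated in place of the frozen base layer.

```python
import torch
import torch.nn as nn


class Expert(nn.Module):
    """Lightweight bottleneck MLP trained on a single piece of updated knowledge."""

    def __init__(self, hidden_dim: int, bottleneck: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, hidden_dim),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)


class EditedLayer(nn.Module):
    """Wraps a frozen base sublayer; each expert has an indexing neuron that
    decides whether that expert handles the current hidden state (hypothetical
    wrapper, illustrating the paradigm rather than the authors' exact code)."""

    def __init__(self, base_layer: nn.Module, hidden_dim: int):
        super().__init__()
        self.base_layer = base_layer              # original (frozen) sublayer
        self.hidden_dim = hidden_dim
        self.experts = nn.ModuleList()            # stage 1: one expert per edit
        self.index_neurons = nn.ParameterList()   # stage 2: one gate per expert

    def add_edit(self) -> None:
        """Append a new expert and its indexing neuron for one edited sample."""
        self.experts.append(Expert(self.hidden_dim))
        self.index_neurons.append(nn.Parameter(torch.zeros(self.hidden_dim)))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        out = self.base_layer(h)
        for expert, w in zip(self.experts, self.index_neurons):
            # Indexing neuron: a learned per-token score; when it fires
            # (sigmoid > 0.5), the expert's output overrides the base output.
            score = torch.sigmoid((h * w).sum(dim=-1, keepdim=True))
            out = torch.where(score > 0.5, expert(h), out)
        return out
```

In this sketch, experts and their gates are appended sequentially as new edits arrive, while the base model stays frozen, which mirrors the scalability claim of the abstract; the actual training objectives and gating mechanism are described in the paper.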
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 378