Abstract: As our understanding of how knowledge is stored in language models deepens, the ability to perform CRUD (Create, Read, Update, Delete) operations on language models becomes increasingly indispensable for managing rapidly changing knowledge. Given the high cost of fine-tuning language models, low-cost model editing methods are typically required to manipulate a model's knowledge. Evidence suggests that knowledge in a Transformer is carried primarily by the MLP blocks; we therefore propose iReVa, a method that explicitly initializes and retrofits key-value pairs in MLP blocks to construct a new mapping for a piece of knowledge without damaging irrelevant knowledge. Compared with existing methods, iReVa offers better interpretability and a stronger capacity for carrying traceable edits. Experimental results on a series of GPT models show strong performance on edit success and generalization without affecting specificity. We also make the first attempt at a knowledge withdrawal test with iReVa. Our code is available on this website.
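To make the "retrofit key-value pairs into MLP blocks" idea concrete, the sketch below shows one way such an edit could be applied to a GPT-2-style MLP: a new key is appended as a row of the up-projection and a new value as a column of the down-projection, so that hidden states matching the key write the value into the residual stream. This is a minimal illustration of the general mechanism, not the authors' implementation; the class and method names (EditableMLP, expand_mlp_with_kv) and the choice of key/value vectors are assumptions for demonstration.

```python
# Minimal sketch (not the paper's code): retrofit one key-value pair into a
# GPT-2-style MLP by appending a key row to the up-projection and a value
# column to the down-projection. Names and dimensions are illustrative.
import torch
import torch.nn as nn


class EditableMLP(nn.Module):
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)   # keys live in the rows of w_in.weight
        self.w_out = nn.Linear(d_ff, d_model)  # values live in the columns of w_out.weight
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.w_out(self.act(self.w_in(h)))

    @torch.no_grad()
    def expand_mlp_with_kv(self, key_vec: torch.Tensor, value_vec: torch.Tensor):
        """Append one (key, value) neuron: hidden states aligned with key_vec
        activate the new neuron, which adds value_vec to the block's output."""
        # New key: one extra row in the up-projection weight, zero bias.
        self.w_in.weight = nn.Parameter(
            torch.cat([self.w_in.weight, key_vec.unsqueeze(0)], dim=0))
        self.w_in.bias = nn.Parameter(
            torch.cat([self.w_in.bias, torch.zeros(1)], dim=0))
        # New value: one extra column in the down-projection weight.
        self.w_out.weight = nn.Parameter(
            torch.cat([self.w_out.weight, value_vec.unsqueeze(1)], dim=1))
        self.w_in.out_features += 1
        self.w_out.in_features += 1


# Usage: an edit maps an input pattern (key) to a desired output direction (value).
mlp = EditableMLP(d_model=768, d_ff=3072)
key = torch.randn(768)    # e.g. hidden state at the edited subject token
value = torch.randn(768)  # e.g. direction promoting the new target token
mlp.expand_mlp_with_kv(key, value)
out = mlp(torch.randn(1, 768))  # the retrofitted neuron now contributes to the output
```

Because the edit only appends parameters rather than overwriting existing weights, the original mapping for unrelated inputs is left intact, which is consistent with the interpretability and traceable-edit claims above.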
Keywords: natural language processing, model editing, language model, key-value adaptor
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6144