FAME: Factual Multi-task Model Editing Benchmark

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission
Abstract: Large language models (LLMs) can retain a wide range of knowledge, but they are also prone to factual inaccuracies. To correct such inaccuracies without costly retraining, a variety of model editing approaches have been proposed. To evaluate these methods, previous work has introduced a series of datasets. However, most of these datasets use fabricated data, rendering them incapable of evaluating or improving model capabilities, and each covers only a single task, so they cannot comprehensively simulate real-world use. To resolve these challenges and effectively enhance the capabilities of LLMs, we present FAME (FActual Multi-task model Editing), an authentic, comprehensive, multi-task dataset designed to make model editing more practical. We then propose SKEME (Structured Knowledge retrieved by Exact Matching and reranking Editing), a model editing technique based on structured knowledge retrieval. Experiments demonstrate that our method performs well across various tasks and scenarios, confirming its practicality.
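The abstract names the core mechanism of SKEME as retrieval of structured knowledge by exact matching followed by reranking. As a rough illustration only, the sketch below shows one way such a pipeline could look: facts stored as (subject, relation, object) triples, retrieved by exact match on the subject, then reranked by word overlap with the query. All function names, the triple store, and the overlap-based scoring heuristic are assumptions for illustration, not the paper's actual implementation.

```python
def build_index(triples):
    """Index (subject, relation, object) triples by lowercased subject."""
    index = {}
    for s, r, o in triples:
        index.setdefault(s.lower(), []).append((s, r, o))
    return index

def retrieve(index, query):
    """Exact-match any indexed subject appearing in the query, then
    rerank candidate facts by relation-word overlap with the query.
    (Illustrative heuristic, not the method described in the paper.)"""
    q_lower = query.lower()
    q_words = set(q_lower.split())
    candidates = []
    for subject, facts in index.items():
        if subject in q_lower:          # exact string match on the subject
            candidates.extend(facts)
    # rerank: facts whose relation shares more words with the query rank higher
    def score(fact):
        _, relation, _ = fact
        return len(set(relation.lower().split()) & q_words)
    return sorted(candidates, key=score, reverse=True)

# Example: edited knowledge is stored as triples and fetched at query time.
triples = [
    ("Paris", "capital of", "France"),
    ("Paris", "population", "2.1 million"),
]
index = build_index(triples)
top = retrieve(index, "What is Paris the capital of?")
```

A retrieval-based editor of this kind would then condition the LLM's answer on the top-ranked fact rather than modifying model weights, which is what makes such approaches cheap relative to retraining.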
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English