Long-form evaluation of model editing

Anonymous

16 Dec 2023 (modified: 20 Dec 2023) · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: We develop a novel evaluation protocol to understand the impact of model editing interventions on long-form, paragraph-length generated text.
Abstract: Evaluations of model editing methods currently rely only on the 'next few token' completions after a prompt. As a result, the impact of these interventions on longer natural language generation is largely unknown. We introduce Long-form Evaluation of Model Editing (LEME), a novel evaluation protocol that measures the efficacy and impact of model editing in long-form generative settings. Our protocol consists of a machine-rated survey and a classifier, both of which correlate well with human ratings. Importantly, we find that our protocol has almost no relationship with previous short-form metrics (despite being designed to extend efficacy, generalization, locality, and portability into a long-form setting), indicating that our method introduces a novel set of dimensions for understanding model editing methods. Using this protocol, we benchmark a number of model editing techniques and present several findings, including that, while some methods (ROME and MEMIT) perform well at making consistent edits within a limited scope, they suffer far more from factual drift than other methods. Finally, we present a qualitative analysis illustrating common failure modes in long-form generative settings, including internal consistency, lexical cohesion, and locality issues.
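To make the shape of such a protocol concrete, the sketch below shows one plausible long-form evaluation loop consistent with the abstract: apply an edit, sample paragraph-length continuations, average machine-rated survey scores over the four named dimensions, and validate the machine rater against human ratings. This is an illustrative sketch only; the callables `apply_edit`, `generate`, and `rate`, and the exact survey wording, are assumptions and not the authors' released code or API.

```python
# Hedged sketch of a long-form edit-evaluation loop in the spirit of LEME.
# `apply_edit`, `generate`, and `rate` are hypothetical user-supplied callables,
# NOT the authors' API; the dimensions simply mirror the axes named in the abstract.
from statistics import mean
from scipy.stats import spearmanr

DIMENSIONS = ("efficacy", "generalization", "locality", "portability")

def evaluate_edit(model, edit, prompts, apply_edit, generate, rate):
    """Apply one edit, sample paragraph-length continuations, and average
    machine-rated survey scores per dimension."""
    edited = apply_edit(model, edit)                # e.g. a ROME or MEMIT update
    per_dim = {d: [] for d in DIMENSIONS}
    for prompt in prompts:
        text = generate(edited, prompt)             # long-form, not next-few-token
        for d in DIMENSIONS:
            per_dim[d].append(rate(text, edit, d))  # one machine-rated survey item
    return {d: mean(scores) for d, scores in per_dim.items()}

def validate_rater(machine_scores, human_scores):
    """Check that machine ratings track human ratings via Spearman rank correlation."""
    rho, p_value = spearmanr(machine_scores, human_scores)
    return rho, p_value
```

In this framing, short-form metrics would inspect only the first few generated tokens, whereas the loop above scores entire paragraphs, which is where the abstract's factual-drift and cohesion failure modes become visible.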
Paper Type: long
Research Area: Resources and Evaluation
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English