Assessing and Post-Processing Black Box Large Language Models for Knowledge Editing

ACL ARR 2024 June Submission 2600 Authors

15 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Knowledge Editing (KE) aims to efficiently and precisely adjust the behavior of large language models (LLMs) to update specific knowledge while minimizing adverse effects on other knowledge. Current research concentrates predominantly on editing white-box LLMs, neglecting an important scenario: editing black-box LLMs, where access is limited to interfaces and only textual output is available. In this paper, we first formally introduce KE on black-box LLMs and then present a comprehensive evaluation framework that addresses the shortcomings of existing evaluations, which are ill-suited to black-box LLM editing and lack comprehensiveness. To address the privacy leakage of editing data and the style over-editing found in existing approaches, we propose a novel postEdit framework that ensures privacy through downstream post-processing and maintains textual style consistency via fine-grained editing. Experiments and analysis on two benchmarks show that postEdit surpasses all baselines and generalizes robustly, improving style retention by an average of +20.82%.
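As a reader's aid, the sketch below illustrates how a downstream post-processing pipeline of this kind could be wired up: the black-box LLM is queried as usual, an edit memory kept entirely on the user side retrieves a matching edit (if any), and a post-editor rewrites only the affected fact in the model's response so the original style is preserved. This is a minimal hypothetical sketch, not the authors' implementation; all identifiers (EditMemory, post_editor, SIM_THRESHOLD, the sentence-transformers retriever) are illustrative assumptions.

```python
# Minimal sketch of a postEdit-style pipeline for black-box knowledge editing.
# Assumptions: a sentence-transformers retriever, a cosine-similarity cutoff,
# and caller-supplied black_box_llm / post_editor callables.
from typing import Callable, Optional

from sentence_transformers import SentenceTransformer, util

SIM_THRESHOLD = 0.7  # hypothetical cutoff deciding whether an edit applies


class EditMemory:
    """Stores (edit query, new answer) pairs downstream of the black-box LLM,
    so editing data never has to be sent to the LLM provider."""

    def __init__(self) -> None:
        self.encoder = SentenceTransformer("all-MiniLM-L6-v2")
        self.edits: list[tuple[str, str]] = []

    def add(self, edit_query: str, new_answer: str) -> None:
        self.edits.append((edit_query, new_answer))

    def retrieve(self, query: str) -> Optional[str]:
        """Return the edited answer whose query best matches, if similar enough."""
        if not self.edits:
            return None
        q_emb = self.encoder.encode(query, convert_to_tensor=True)
        e_embs = self.encoder.encode([q for q, _ in self.edits], convert_to_tensor=True)
        scores = util.cos_sim(q_emb, e_embs)[0]
        best = int(scores.argmax())
        return self.edits[best][1] if float(scores[best]) >= SIM_THRESHOLD else None


def answer_with_postedit(
    query: str,
    black_box_llm: Callable[[str], str],
    post_editor: Callable[[str, str], str],
    memory: EditMemory,
) -> str:
    """Query the black-box LLM, then rewrite its output only if an edit applies."""
    original = black_box_llm(query)
    new_answer = memory.retrieve(query)
    if new_answer is None:
        return original  # out of editing scope: leave the LLM's answer untouched
    # Fine-grained post-editing: rewrite only the edited fact, keeping the
    # surrounding wording and style of the original response.
    return post_editor(original, new_answer)
```

Keeping retrieval and editing downstream of the API call is what gives the privacy property the abstract highlights: the provider only ever sees the user query, never the editing data.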
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: fact checking, knowledge graphs
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 2600