Assessing and Post-Processing Black Box Large Language Models for Knowledge Editing

ACL ARR 2024 June Submission

15 Jun 2024
Abstract: Knowledge Editing (KE) aims to efficiently and precisely adjust the behavior of large language models (LLMs) so that specific knowledge is updated while adverse effects on other knowledge are minimized. Current research concentrates predominantly on editing white-box LLMs and neglects an important scenario: editing black-box LLMs, where access is limited to interfaces and only textual output is available. In this paper, we first formally introduce KE on black-box LLMs and then present a thorough evaluation framework that addresses the shortcomings of current evaluations, which are unsuited to black-box LLM editing and lack comprehensiveness. To address the privacy leakage of editing data and the style over-editing found in existing approaches, we propose a new framework, postEdit, which protects privacy through downstream processing and maintains textual style consistency via fine-grained editing. Experiments and analysis on two benchmarks show that postEdit surpasses all baselines and generalizes robustly, notably improving style retention by an average of +20.82%.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: fact checking, knowledge graphs
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
