ThinkEval: Practical Evaluation of Knowledge Leakage in LLM Editing using Thought-based Knowledge Graphs

Published: 19 Jan 2026, Last Modified: 19 Jan 2026. Accepted by TMLR. License: CC BY 4.0
Abstract: Robust model-editing techniques are essential for deploying large language models (LLMs) in practical applications, as they enable cost-effective ways to deal with challenges such as privacy breaches, bias mitigation, and misinformation spread. For example, an LLM-based healthcare assistant may need to update outdated or incorrect knowledge to prevent harmful recommendations. However, many editing techniques focus on isolated facts, which critically fails to prevent indirect knowledge leakage---the unintended reconstruction of edited-out information through persistent causal links and contextual relationships. To assist users in selecting the right editing technique, we develop and present ThinkEval, a framework to systematically quantify indirect knowledge leakage and ripple effects in model-editing. ThinkEval builds and employs specialized knowledge graphs to analyze the causal structure of facts before and after editing. To support this approach, we present KnowGIC, a benchmark dataset comprising multi-step reasoning paths that precisely measure these complex knowledge transformation effects. We evaluate five editing techniques (AlphaEdit, RECT, ROME, MEMIT, and PRUNE) across multiple LLMs. Our results show that these techniques struggle to balance indirect fact suppression with the preservation of related knowledge, compromising the contextual integrity of a model's knowledge. Our dataset is available at: https://github.com/manitbaser/KnowGIC.
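To illustrate the core idea of indirect knowledge leakage described in the abstract, here is a minimal, hypothetical sketch: a causal fact graph is checked before and after an edit to see whether the suppressed fact's conclusion remains reachable through surviving links. All node names and the graph structure are illustrative assumptions, not data from the paper or the KnowGIC dataset.

```python
# Toy illustration of indirect leakage through a causal fact graph.
# All names and links below are hypothetical examples.
from collections import deque

def reachable(graph, start, target):
    """BFS: is `target` still derivable from `start` via causal links?"""
    seen, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

# Causal graph of facts before editing (node -> causally linked facts).
before = {
    "drug_X": ["treats_condition_Y", "interacts_with_Z"],
    "treats_condition_Y": ["recommended_for_patients"],
    "interacts_with_Z": ["recommended_for_patients"],
}

# A naive edit removes only the direct link to the target fact.
after = {k: [v for v in vs if v != "treats_condition_Y"]
         for k, vs in before.items()}

# The edited-out conclusion is still reachable via an indirect path,
# i.e. the edit leaks knowledge through interacts_with_Z.
leaked = reachable(after, "drug_X", "recommended_for_patients")
print("indirect leakage after edit:", leaked)
```

This captures only the reachability aspect of the analysis; the actual ThinkEval framework operates on knowledge graphs derived from model behavior and multi-step reasoning paths, per the abstract.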
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Dear Action Editor and Reviewers, We would like to express our sincere gratitude for your careful evaluation and constructive feedback, as well as for the decision to accept the paper. Since the previous submission, we have fixed minor typographical issues in the main text that were noted by the Reviewers. In addition, we have incorporated all requested clarifications, analyses, and discussions in the Appendix (from Section I onwards), including further details on evaluation methodology, robustness analyses, reproducibility, and responsible model editing considerations. We are grateful for the Reviewers' thoughtful comments, which have helped us improve the clarity and overall quality of our manuscript.
Code: https://github.com/manitbaser/KnowGIC
Assigned Action Editor: ~Bamdev_Mishra1
Submission Number: 5815