Tool Unlearning for Tool-Augmented LLMs

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Tool-Augmented LLM, Machine Unlearning, Tool Unlearning
TL;DR: We propose Tool Unlearning, a novel machine unlearning task in which acquired tools are removed from tool-augmented LLMs, along with an effective unlearning method.
Abstract: Tool-augmented large language models (LLMs) may need to forget learned tools due to security concerns, privacy restrictions, or tool deprecation. However, tool unlearning has not been explored in prior machine unlearning work. We propose tool unlearning, a novel machine unlearning task that deletes already acquired tools. Compared to traditional unlearning, tool unlearning poses distinct challenges: 1) removing knowledge rather than forgetting individual samples, 2) the significant cost of optimizing LLMs, and 3) the lack of principled evaluation tools. To bridge this gap, we introduce three properties for effective tool unlearning and propose ToolDelete, the first unlearning method designed for tool-augmented LLMs. We also propose the first membership inference attack (MIA) model for evaluating tool unlearning. Experiments on three tool learning datasets and tool-augmented LLMs demonstrate that ToolDelete effectively unlearns both randomly selected tools and tools from specific categories, without degrading the LLM's knowledge of non-deleted tools or its performance on general tasks.
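As an illustration of what tool unlearning involves in practice, the sketch below shows a generic gradient-ascent unlearning baseline applied to tool-use demonstrations: the model's likelihood of forget-set demonstrations is pushed down while its likelihood of retain-set demonstrations is preserved. This is not ToolDelete itself (this page does not describe the method); the base model, demonstration strings, and `retain_weight` hyperparameter are all assumptions made for the example.

```python
# Minimal sketch of a gradient-ascent unlearning baseline for tool
# demonstrations. Illustrative only -- not the paper's ToolDelete method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")   # placeholder base model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token              # gpt2 has no pad token
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def lm_loss(texts):
    """Causal LM loss over a batch of tool-use demonstrations."""
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    out = model(**batch, labels=batch["input_ids"])
    return out.loss

# Hypothetical demonstrations: forget the deprecated tool, keep the other.
forget_demos = ["User: convert 10 USD to EUR\nAssistant: call deprecated_fx_tool(amount=10)"]
retain_demos = ["User: what's the weather in Paris?\nAssistant: call weather_tool(city='Paris')"]

retain_weight = 1.0  # assumed trade-off between forgetting and retention
model.train()
for _ in range(10):
    optimizer.zero_grad()
    # Ascend on the forget set (negated loss), descend on the retain set.
    loss = -lm_loss(forget_demos) + retain_weight * lm_loss(retain_demos)
    loss.backward()
    optimizer.step()
```

A membership-style check afterward would compare the model's loss on forget-set demonstrations against held-out tool demonstrations; after successful unlearning the two should be hard to distinguish, which is the intuition behind using an MIA model for evaluation.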
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12394