MUNCH: A Multitask Unlearning Benchmark for LLMs

Published: 13 Jan 2025 · Last Modified: 26 Feb 2025 · AAAI 2025 PDLM Poster · CC BY 4.0
Keywords: LLM, Unlearning
Abstract:

Unlearning aims to remove copyrighted, sensitive, or private content from large language models (LLMs) without full retraining. In this work, we develop a multitask unlearning benchmark (MUNCH) that features three tasks: (1) unlearning synthetically generated creative short novels, (2) unlearning synthetic biographies with sensitive information, and (3) unlearning a collection of public biographies. We further release two fine-tuned LLMs, with 1B and 7B parameters, as the target models. We conduct detailed evaluations of several recently proposed unlearning algorithms and present results on carefully crafted metrics to understand their behavior and limitations.
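For context, the sketch below shows one common family of unlearning baselines: gradient ascent on the forget set, which increases the model's loss on the content to be removed. This is illustrative only; the abstract does not specify which algorithms MUNCH evaluates, and the checkpoint name and forget text are hypothetical placeholders.

```python
# Minimal sketch of gradient-ascent unlearning on a forget set.
# Illustrative only: checkpoint name and forget text are placeholders,
# not artifacts released with MUNCH.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("target-model-1b")  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained("target-model-1b")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["<document to be unlearned>"]  # e.g. a synthetic biography

model.train()
for text in forget_texts:
    batch = tokenizer(text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    # Negate the language-modeling loss so the optimizer *ascends* it,
    # lowering the model's likelihood of the forget-set text.
    (-out.loss).backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice such baselines are usually paired with a retain-set term (e.g., gradient difference or a KL penalty against the original model) to limit collateral damage to general capabilities, which is precisely the kind of behavior a benchmark's metrics are designed to measure.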

Submission Number: 22