MUNCH: A Multitask Unlearning Benchmark for LLMs

Published: 13 Jan 2025 · Last Modified: 26 Feb 2025 · AAAI 2025 PDLM Poster · CC BY 4.0
Keywords: LLM, Unlearning
Abstract:

Unlearning aims to remove copyrighted, sensitive, or private content from large language models (LLMs) without full retraining. In this work, we develop a multitask unlearning benchmark (MUNCH) that features three tasks: (1) unlearning synthetically generated creative short novels, (2) unlearning synthetic biographies with sensitive information, and (3) unlearning a collection of public biographies. We further release two fine-tuned LLMs, with 1B and 7B parameters, as the target models. We conduct detailed evaluations of several recently proposed unlearning algorithms and present results on carefully crafted metrics to understand their behavior and limitations.
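For context, the sketch below shows one common family of unlearning baselines: gradient ascent on the forget set, which increases the model's loss on the content to be removed. This is illustrative only; the abstract does not specify which algorithms MUNCH evaluates, and the checkpoint name and forget text are hypothetical placeholders.

```python
# Minimal sketch of gradient-ascent unlearning on a forget set.
# Illustrative only: checkpoint name and forget text are placeholders,
# not artifacts released with MUNCH.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("target-model-1b")  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained("target-model-1b")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["<document to be unlearned>"]  # e.g. a synthetic biography

model.train()
for text in forget_texts:
    batch = tokenizer(text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    # Negate the language-modeling loss so the optimizer *ascends* it,
    # lowering the model's likelihood of the forget-set text.
    (-out.loss).backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice such baselines are usually paired with a retain-set term (e.g., gradient difference or a KL penalty against the original model) to limit collateral damage to general capabilities, which is precisely the kind of behavior a benchmark's metrics are designed to measure.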

Submission Number: 22