Unlearning as Multi-Task Optimization: A Normalized Gradient Difference Approach with Adaptive Learning Rate
Keywords: machine unlearning, multi-task optimization, learning rate scheduler, large language models
TL;DR: We formulate unlearning as a two-task optimization problem (forgetting and retaining), to which we apply a normalized gradient difference and an automatic learning rate schedule.
Abstract: Unlearning techniques have been proposed as a cost-effective post-training way to remove undesired knowledge learned by large language models (LLMs). However, existing methods often fail to effectively unlearn the targeted information or cause a significant drop in model performance. In this paper, we frame machine unlearning as a multi-task optimization problem to balance this tradeoff: one task maximizes the forgetting loss, while the other minimizes the retaining loss. We introduce a novel unlearning method, Normalized Gradient Difference (NGDiff), which guarantees Pareto optimality upon convergence. Specifically, NGDiff dynamically normalizes the task gradients, enabling the model to unlearn the targeted forget data while preserving utility on the retain set. We further find that unlearning methods are sensitive to the learning rate and therefore integrate an automatic learning rate scheduler that selects a locally optimal learning rate to stabilize and accelerate convergence. Experiments with various LLMs demonstrate that NGDiff outperforms state-of-the-art unlearning methods on the TOFU and MUSE datasets.
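To make the two-task formulation concrete, below is a minimal, hypothetical sketch of an NGDiff-style update step in PyTorch. It only illustrates the ideas described in the abstract (per-task gradient normalization combined as a gradient difference, plus a crude stand-in for the automatic learning rate scheduler); all names (`loss_fn`, `forget_batch`, `candidate_lrs`, etc.) and the grid-search step-size selection are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a normalized-gradient-difference (NGDiff-style) update.
# Names and the grid-search learning-rate selection are illustrative assumptions.
import torch


def flat_grad(loss, params):
    """Gradient of `loss` w.r.t. `params`, flattened into a single vector."""
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


def apply_flat(params, flat):
    """Copy a flat parameter vector back into the model parameters."""
    offset = 0
    with torch.no_grad():
        for p in params:
            n = p.numel()
            p.copy_(flat[offset:offset + n].view_as(p))
            offset += n


def ngdiff_step(model, forget_batch, retain_batch, loss_fn,
                candidate_lrs=(1e-5, 3e-5, 1e-4)):
    params = [p for p in model.parameters() if p.requires_grad]

    # Two tasks: maximize the loss on the forget set, minimize it on the retain set.
    g_forget = flat_grad(loss_fn(model, forget_batch), params)
    g_retain = flat_grad(loss_fn(model, retain_batch), params)

    # Normalize each task gradient so neither task dominates the combined direction.
    eps = 1e-12
    g = g_retain / (g_retain.norm() + eps) - g_forget / (g_forget.norm() + eps)

    # Stand-in for the automatic learning-rate scheduler: pick the candidate step
    # size that best trades off retaining vs. forgetting on the current batches.
    flat_params = torch.cat([p.detach().reshape(-1) for p in params])
    best_lr, best_obj = candidate_lrs[0], float("inf")
    with torch.no_grad():
        for lr in candidate_lrs:
            apply_flat(params, flat_params - lr * g)
            obj = loss_fn(model, retain_batch).item() - loss_fn(model, forget_batch).item()
            if obj < best_obj:
                best_lr, best_obj = lr, obj
    apply_flat(params, flat_params - best_lr * g)
```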
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4994