Abstract: Personal digital data is a critical asset, and governments worldwide have enacted laws and regulations to protect data privacy. Under these laws, users are granted the “right to be forgotten” (RTBF) with respect to their data. In machine learning (ML), this right requires a model provider, upon user request, to delete the user's data and remove its influence on trained ML models. Machine unlearning (MU) has emerged to address this requirement and has garnered ever-increasing attention from both industry and academia. Specifically, MU allows model providers to eliminate the influence of the unlearned data without retraining the model from scratch, so that the model behaves as if it had never encountered this data. Although the area has developed rapidly, comprehensive surveys that capture the latest advancements are lacking. To fill this gap, we conduct an extensive exploration to map the landscape of MU, including the (fine-grained) taxonomy of unlearning algorithms under centralized and distributed settings, the debate on approximate unlearning, verification and evaluation metrics, and challenges and solutions across various applications. We also examine the motivations, challenges, and specific methods for deploying unlearning in large language models (LLMs), as well as potential attacks targeting unlearning processes. The survey concludes by outlining potential directions for future research, in the hope of serving as a guide for interested scholars.
External IDs: dblp:journals/tnn/LiZGCZKF25