Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models

Published: 27 May 2026, Last Modified: 27 May 2026, Accepted by TMLR, CC BY 4.0
Abstract: Machine unlearning removes certain training data points and their influence from AI models (e.g., when a data owner revokes their consent to allow models to learn from the data). In this position paper, we propose to lift data-tracing machine unlearning to knowledge-tracing for foundation models (FMs). We support this position based on practical needs and insights from cognitive studies. Practically, tracing data cannot meet the diverse unlearning requests for FMs, which may be from regulators, enterprise users, product teams, etc., who have no access to FMs' massive training data. Instead, it is convenient for these parties to issue an unlearning request about the knowledge or capability FMs (should not) possess. Cognitively, knowledge-tracing unlearning aligns with how the human brain forgets more closely than tracing individual training data points does. We further discuss the nontrivial challenges in the knowledge-tracing machine unlearning paradigm. Finally, we provide a concrete case study about a vision-language FM to illustrate how an unlearner might instantiate the knowledge-tracing machine unlearning paradigm. Code is available at: https://1yuwen.github.io/Knowledge-Tracing-MU-Page.
Submission Length: Regular submission (no more than 12 pages of main content)
Code: https://github.com/1yuwen/Knowledge-Tracing-MU
Assigned Action Editor: ~Tianbao_Yang1
Submission Number: 6066