Abstract: Machine unlearning is used to mitigate the privacy risks of Large Vision-Language Models (LVLMs) arising from training on large-scale web data. However, existing unlearning methods often fail to carefully select substitute outputs for forget targets, resulting in Unlearning Aftermaths—undesirable behaviors such as degenerate, hallucinated, or excessively refused responses. We highlight that, especially for generative LVLMs, it is crucial to consider the quality and informativeness of post-unlearning responses rather than relying solely on naive suppression. To address this, we introduce a new unlearning task for LVLMs that requires models to provide privacy-preserving yet informative and visually grounded responses. We also propose PUBG, a novel unlearning method that explicitly guides post-unlearning behavior toward a desirable output distribution. Experiments show that, while existing methods suffer from Unlearning Aftermaths despite successfully preventing privacy violations, PUBG effectively mitigates these issues, generating visually grounded and informative responses without privacy leakage for forgotten targets.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: multimodal applications, security/privacy, NLP for social good
Contribution Types: NLP engineering experiment
Languages Studied: English
Previous URL: https://openreview.net/forum?id=6FlnTZ3jFB
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: The previous reviewer was unable to properly engage in the review discussion, as they did not clearly understand our mathematical formulations.
Software: zip
Data: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: In the Ethics Statements section.
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 4
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Section 4
B5 Documentation Of Artifacts: N/A
B5 Elaboration: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Section 4
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Appendix C
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Appendix C
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 4
C4 Parameters For Packages: Yes
C4 Elaboration: Section 4
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 1331