Keywords: Multimodal machine unlearning; benchmark; Fine-grained unlearning; Visual rumors
Abstract: Advances in Multimodal Large Language Models (MLLMs) intensify concerns about data safety, making Machine Unlearning (MU), the selective removal of harmful/private information, a critical necessity.
However, existing MU benchmarks for MLLMs are limited by a lack of image diversity, coarse-grained unlearning targets, and insufficient evaluation scenarios, and thus fail to capture the complexity of real-world applications.
To facilitate the development of MLLM unlearning and address the aforementioned limitations, we introduce OFFSIDE, a novel benchmark for evaluating misinformation unlearning in MLLMs.
This manually curated dataset contains 15.68K records for 80 players, providing a comprehensive framework with four test sets to assess forgetting efficacy, generalization, utility, and robustness.
OFFSIDE supports advanced unlearning targets, such as fine-grained unlearning and visual rumor removal.
Our extensive evaluation of multiple baselines extends key findings from LLM MU to MLLM MU: (1) unlearned rumors can be easily recovered through relearning, and (2) all methods are vulnerable to prompt attacks. It also yields novel insights specific to MLLMs: (1) unimodal methods fail to handle multimodal rumors, (2) unlearning efficacy is statistically driven primarily by catastrophic forgetting, and (3) all methods struggle with visual rumors (rumors embedded in images).
These results expose significant vulnerabilities in current approaches, highlighting the need for more robust multimodal unlearning solutions.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: benchmarking, evaluation
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English
Submission Number: 801