Keywords: Multimodal machine unlearning; benchmark; Fine-grained unlearning; Visual rumors
Abstract: Advances in Multimodal Large Language Models (MLLMs) intensify concerns about data safety, making Machine Unlearning (MU), the selective removal of harmful/private information, a critical necessity.
However, existing MU benchmarks for MLLMs are limited by a lack of image diversity, coarse-grained unlearning targets, and insufficient evaluation scenarios, and thus fail to capture the complexity of real-world applications.
To facilitate the development of MLLM unlearning and address the aforementioned limitations, we introduce OFFSIDE, a novel benchmark for evaluating misinformation unlearning in MLLMs.
This manually curated dataset contains 15.68K records for 80 players, providing a comprehensive framework with four test sets to assess forgetting efficacy, generalization, utility, and robustness.
OFFSIDE supports advanced unlearning targets, such as fine-grained unlearning and visual rumor removal.
Our extensive evaluation of multiple baselines extends key findings from LLM MU to MLLM MU: (1) unlearned rumors can be easily recovered through relearning, and (2) all methods are vulnerable to prompt attacks. It also yields novel insights specific to MLLMs: (1) unimodal methods fail to handle multimodal rumors, (2) unlearning efficacy is statistically driven primarily by catastrophic forgetting, and (3) all methods struggle with visual rumors (rumors embedded in images).
These results expose significant vulnerabilities in current approaches, highlighting the need for more robust multimodal unlearning solutions.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: benchmarking, evaluation
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English
Submission Number: 801