SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning

Cai Selvas-Sala, Lei Kang, Lluis Gomez

Published: 05 Jun 2026, Last Modified: 09 May 2026, CVPR 2026 (Accepted, to appear), Everyone, CC BY 4.0

Abstract: As multimodal models like CLIP become integral to downstream systems, the need to remove sensitive information is critical. However, machine unlearning for contrastively trained encoders remains underexplored, and existing evaluations fail to diagnose fine-grained, association-level forgetting. We introduce SALMUBench (Sensitive Association-Level Multimodal Unlearning), a benchmark built upon a synthetic dataset of 60K persona-attribute associations and two foundational models: a Compromised model polluted with this data, and a Clean model without it. To isolate unlearning effects, both are trained from scratch on the same 400M-pair retain base, with the Compromised model additionally trained on the sensitive set. We propose a novel evaluation protocol with structured holdout sets (holdout identity, holdout association) to precisely measure unlearning efficacy and collateral damage. Our benchmark reveals that while utility-efficient deletion is feasible, current methods exhibit distinct failure modes: they either fail to forget effectively or over-generalize by erasing more than intended. SALMUBench sets a new standard for comprehensive unlearning evaluation, and we publicly release our dataset, models, evaluation scripts, and leaderboards to foster future research.
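The idea of comparing a forget set against structured holdout sets can be illustrated with a minimal sketch. The abstract does not specify the benchmark's actual metrics, so everything below is an assumption made for illustration: association strength is taken to be the mean image-text cosine similarity, the split names simply mirror the abstract's wording, and the helper names (`cosine`, `mean_association`, `unlearning_report`) and the toy 2-D embeddings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_association(pairs):
    """Mean cosine similarity over (image_embedding, text_embedding) pairs."""
    return sum(cosine(img, txt) for img, txt in pairs) / len(pairs)

def unlearning_report(before, after):
    """Per-split drop in mean association strength.

    `before` and `after` map a split name to a list of
    (image_embedding, text_embedding) pairs produced by the model
    before and after unlearning (assumed, hypothetical interface).
    """
    return {
        split: mean_association(before[split]) - mean_association(after[split])
        for split in before
    }

# Toy 2-D embeddings (made up for illustration, not benchmark data).
before = {
    "forget": [((1.0, 0.0), (1.0, 0.0))],
    "holdout_identity": [((0.0, 1.0), (0.0, 1.0))],
    "holdout_association": [((1.0, 0.0), (1.0, 0.0))],
}
after = {
    "forget": [((1.0, 0.0), (0.0, 1.0))],            # association erased
    "holdout_identity": [((0.0, 1.0), (0.0, 1.0))],   # untouched
    "holdout_association": [((1.0, 0.0), (1.0, 0.0))],  # untouched
}

report = unlearning_report(before, after)
for split, drop in report.items():
    print(f"{split}: drop in association = {drop:.2f}")
```

Under this assumed metric, a large drop on the forget split with near-zero drops on the holdout splits would correspond to targeted forgetting, a small drop on the forget split to ineffective forgetting, and large drops on the holdout splits to the over-generalization failure mode the abstract describes.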