MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Knowledge Poisoning Attacks

Published: 02 Mar 2026, Last Modified: 05 Mar 2026 | ICLR 2026 Trustworthy AI | CC BY 4.0
Keywords: multimodal RAG, knowledge poisoning, multimodal large language models
TL;DR: We propose the first knowledge poisoning attack on multimodal RAG frameworks, revealing vulnerabilities posed by poisoned external knowledge bases.
Abstract: Retrieval-Augmented Generation (RAG) has become common practice in multimodal large language models (MLLMs) to enhance factual grounding and reduce hallucination. The benefits of retrieving external texts and images, however, come at a cost: they expose the entire multimodal RAG framework to knowledge poisoning attacks. In such attacks, adversaries deliberately inject malicious multimodal content into external knowledge bases to steer models toward generating incorrect or even harmful responses. We present MM-PoisonRAG, the first framework to systematically study the vulnerability of multimodal RAG under knowledge poisoning. Specifically, we design two novel attack strategies: Localized Poisoning Attack (LPA), which implants targeted, query-specific multimodal misinformation to manipulate outputs toward attacker-controlled responses, and Globalized Poisoning Attack (GPA), which uses a single, untargeted adversarial injection to broadly corrupt reasoning and collapse generation quality across all queries. Extensive experiments across diverse tasks (e.g., MMQA, WebQA), multimodal RAG components (e.g., retriever, reranker, generator), and attacker access levels (from black-box to white-box) demonstrate the severity of these threats. LPA achieves an attack success rate of up to 56% even under restricted access and transfers well, disrupting generation across four different retrievers without re-optimizing the adversarial content. GPA drives model accuracy to 0% with a single poisoned knowledge entry. Moreover, we show that both LPA and GPA bypass existing defenses, underscoring the fragility of multimodal RAG and establishing MM-PoisonRAG as a foundation for future research on safeguarding retrieval-augmented MLLMs against multimodal knowledge poisoning.
Submission Number: 111