Poisoning and Policing: Towards Vision Data Integrity in Multimodal Retrieval-Augmented Generation Systems
Keywords: Multimodal RAG, Knowledge Injection, Data Integrity, Benchmark, MLLMs
Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by integrating external data but remains vulnerable to poisoning attacks that compromise knowledge integrity. Existing work on knowledge injection focuses solely on textual payloads, leaving vulnerabilities arising from vision inputs unaddressed. In this paper, we bridge this research gap by presenting the first image-based injection attack targeting multimodal RAG systems. Building on two practical pipelines, we develop simple but effective attack methods that poison image-text data sources. Our methods achieve attack success rates ranging from 20% to 100% across various configurations, compelling the system to retrieve and/or generate malicious content. Recognizing the significant poisoning risk posed to multimodal RAG systems, we present IntDetEval, a novel multimodal benchmark for assessing how well multimodal LLMs (MLLMs) police vision data integrity. Experiments under both weakDet and strongDet settings expose serious policing deficiencies in MLLMs, with a false negative rate above 27% even for Claude-Sonnet-4.5.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: Multimodality and Language Grounding to Vision, Robotics and Beyond, Resources and Evaluation
Languages Studied: English
Submission Number: 1704