Poisoning and Policing: Towards Vision Data Integrity in Multimodal Retrieval-Augmented Generation Systems
Keywords: Multimodal RAG, Knowledge Injection, Data Integrity, Benchmark, MLLMs
Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by integrating external data but remains vulnerable to poisoning attacks that compromise knowledge integrity. Existing work on knowledge injection focuses solely on textual payloads, leaving vulnerabilities arising from vision inputs unaddressed. In this paper, we bridge this research gap by presenting the first image-based injection attack targeting multimodal RAG systems. Building on two practical pipelines, we develop simple but effective attack methods that poison image-text data sources. Our methods achieve attack success rates ranging from 20% to 100% across various configurations, compelling the system to retrieve and/or generate malicious content. Recognizing the significant poisoning risk posed to multimodal RAG systems, we present IntDetEval, a novel multimodal benchmark for assessing how well multimodal LLMs (MLLMs) police vision data integrity. Experiments under both weakDet and strongDet settings expose serious policing deficiencies in MLLMs, with a false negative rate above 27% even for Claude-Sonnet-4.5.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: Multimodality and Language Grounding to Vision, Robotics and Beyond, Resources and Evaluation
Languages Studied: English
Submission Number: 1704