BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models

ACL ARR 2024 December Submission1722 Authors

16 Dec 2024 (modified: 05 Feb 2025), ACL ARR 2024 December Submission, CC BY 4.0
Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by retrieving relevant information from external knowledge bases to provide more accurate, contextually informed, and up-to-date responses. However, this reliance on external knowledge introduces significant security vulnerabilities. In this paper, we unveil a novel backdoor threat in which attackers exploit the openness of these knowledge bases by injecting malicious passages. This threat is both realistic and severe, as many RAG systems (e.g., Google Search) rely on large and unsanitized data repositories (e.g., Reddit). We propose BadRAG, a backdoor attack that employs a two-stage malicious passage optimization framework specifically designed to exploit this vulnerability. First, malicious passages are optimized to be retrieved exclusively when specific trigger words appear in user queries. Second, these passages are meticulously crafted to achieve adversarial generation objectives, including denial of service, sentiment manipulation, privacy violations, and tool misuse. Notably, BadRAG operates solely by injecting several malicious passages into the external knowledge base, demonstrating that RAG's corpora can serve as an effective backdoor carrier without any need to modify the weights of RAG's retriever or generator. Our experiments show that injecting just 10 malicious passages (0.04% of the external corpora) achieves a 98.2% retrieval success rate and increases negative response rates from 0.22% to 72% for targeted queries.
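The headline figures in the abstract can be put in proportion with a little arithmetic. The sketch below is only a back-of-the-envelope check, not part of the submission: the function names are hypothetical, and the corpus size it derives is an inference from the rounded "0.04%" figure, since the abstract does not state the corpus size directly.

```python
from fractions import Fraction


def implied_corpus_size(n_injected, percent_of_corpus):
    """Corpus size implied by n_injected passages making up percent_of_corpus percent.

    Assumption: the percentage quoted in the abstract is rounded, so the
    result is approximate.
    """
    return n_injected / (percent_of_corpus / 100)


def rate_change(before_percent, after_percent):
    """Return (absolute change in percentage points, multiplicative factor)."""
    return after_percent - before_percent, after_percent / before_percent


# Figures quoted in the abstract: 10 passages are 0.04% of the corpus,
# and the negative response rate rises from 0.22% to 72%.
corpus = implied_corpus_size(10, Fraction(4, 100))
delta, ratio = rate_change(Fraction(22, 100), Fraction(72))

print(f"implied corpus size: about {int(corpus):,} passages")
print(f"absolute increase: {float(delta):.2f} percentage points")
print(f"relative increase: about {float(ratio):.0f}x")
```

Exact fractions are used instead of floats so that the rounded percentages from the abstract do not pick up additional floating-point error.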
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Ethics, Bias, and Fairness; Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 1722