On the Vulnerability of Applying Retrieval-Augmented Generation within Knowledge-Intensive Application Domains
Keywords: Healthcare; Safety; RAG
Abstract: Retrieval-Augmented Generation (RAG) has been empirically shown to enhance
the performance of large language models (LLMs) in knowledge-intensive domains
such as healthcare, finance, and legal contexts. Given a query, RAG retrieves
relevant documents from a corpus and integrates them into the LLMs’ generation
process. In this study, we investigate the adversarial robustness of RAG, focusing
specifically on examining the retrieval system. First, across 225 different setup
combinations of corpus, retriever, query, and targeted information, we show that
retrieval systems are vulnerable to universal poisoning attacks in medical Q&A. In
such attacks, adversaries generate poisoned documents containing a broad spectrum
of targeted information, such as personally identifiable information. When these
poisoned documents are inserted into a corpus, they can be accurately retrieved
by any users, as long as attacker-specified queries are used. To understand this
vulnerability, we discovered that the deviation from the query’s embedding to that
of the poisoned document tends to follow a pattern in which the high similarity
between the poisoned document and the query is retained, thereby enabling precise
retrieval. Based on these findings, we develop a new detection-based defense to
ensure the safe use of RAG. Extensive experiments across various Q&A domains
show that our proposed method achieves excellent detection rates in nearly all
cases.
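The mechanism the abstract describes, inserting a poisoned document that embeds an attacker-specified query so that it is retrieved whenever that query is issued, can be illustrated with a toy sketch. The bag-of-words "embedding" and all document strings below are illustrative stand-ins, not the paper's setup; real attacks target dense retrievers such as transformer encoders.

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy bag-of-words embedding; a stand-in for a dense retriever encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    # Rank corpus documents by similarity to the query embedding.
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

corpus = [
    "aspirin reduces fever and mild pain",
    "ibuprofen is an anti-inflammatory drug",
]

attacker_query = "what is the recommended dose of drug X for children"
# Poisoned document: the attacker-specified query concatenated with the
# targeted payload, so its embedding stays close to the query's embedding.
poisoned = attacker_query + " [PAYLOAD: targeted information, e.g. PII]"
corpus.append(poisoned)

top = retrieve(attacker_query, corpus)[0]
print(top == poisoned)  # the poisoned document is retrieved first
```

Because the poisoned document contains every query term, its similarity to the query dominates that of the legitimate documents, which is the "retained high similarity" pattern the abstract attributes to the attack; a detection-based defense can look for exactly such anomalously query-like documents.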
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13014