Strengthening LLM Trust Boundaries: A Survey of Prompt Injection Attacks Surender Suresh Kumar Dr. M.L. Cummings Dr. Alexander Stimpson

Published: 01 Jan 2024, Last Modified: 13 May 2025ICHMS 2024EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Both academia and industry have embraced Large Language Models (LLMs) as new tools to extend or create new capabilities in human-artificial intelligence (AI) interactions. This rise of LLMs has also led to an impressive increase in LLM prompt injections, which are human-driven attacks meant to manipulate the LLMs to operate outside the safe boundaries and guardrails originally designed to prevent such behaviors. This effort presents a cohesive framework that organizes various prompt injection attacks with respect to the type of prompt used in attacks, the type of trust boundary the attacks violated, and the level of expertise required to carry out such attacks. Analysis of this framework leads to recommendations for how trust boundaries could be further strengthened through a combination of sociotechnical approaches.
Loading