Towards Safe and Secure LLM Systems: Defense Strategies and Assessment Against Inference-Time Security Risks
Abstract: Large Language Models (LLMs) power diverse applications, from creative content generation to human-like computer use, yet they remain stochastic and unreliable, creating potential security loopholes in any system that uses them. In this work, we propose a complete security risk management flow for LLM systems, including cross-stream mitigation strategies and their incorporation into the development life cycle. In our case study, we evaluate cloud-based and custom guardrails for Retrieval-Augmented Generation (RAG) systems, demonstrating the instrumental role of guardrails in lowering Attack Success Rates for RAG systems and the need for a holistic risk mitigation framework that covers all security risk categories. Our work aims to showcase best practices for trustworthy inference, contributing to more secure and safer LLM-based applications.
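As a rough illustration of the metric referenced above, the sketch below computes Attack Success Rate (ASR) as the fraction of adversarial prompts that elicit a harmful response from a pipeline, optionally with a guardrail applied before generation. The `generate`, `is_attack_successful`, and `guardrail` callables are hypothetical placeholders for whichever RAG pipeline, attack judge, and guardrail a deployment actually uses; they are not defined in the paper.

```python
from typing import Callable, Iterable, Optional


def attack_success_rate(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    is_attack_successful: Callable[[str, str], bool],
    guardrail: Optional[Callable[[str], bool]] = None,
) -> float:
    """Fraction of adversarial prompts that yield a harmful response.

    `generate` stands in for the RAG pipeline under test, `is_attack_successful`
    for an attack judge, and `guardrail` returns True when a prompt should be
    blocked before reaching the model. All three are assumed components.
    """
    prompt_list = list(prompts)
    successes = 0
    for prompt in prompt_list:
        # Prompts blocked by the guardrail count as unsuccessful attacks.
        if guardrail is not None and guardrail(prompt):
            continue
        response = generate(prompt)
        if is_attack_successful(prompt, response):
            successes += 1
    return successes / len(prompt_list) if prompt_list else 0.0
```

Under this definition, a guardrail lowers ASR whenever it blocks prompts that would otherwise have produced successful attacks, which is the comparison the case study makes between guarded and unguarded RAG configurations.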