TrojanSQL: SQL Injection against Natural Language Interface to Database

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
Venue: EMNLP 2023 Main
Submission Type: Regular Long Paper
Submission Track: Ethics in NLP
Submission Track 2: NLP Applications
Keywords: text-to-SQL, NLIDB, security, SQL Injection, NL2Code
TL;DR: We propose a novel attack paradigm against natural language interfaces to databases (NLIDB).
Abstract: The technology of text-to-SQL has significantly enhanced the efficiency of accessing and manipulating databases. However, limited research has been conducted to study its vulnerabilities emerging from malicious user interaction. By proposing TrojanSQL, a backdoor-based SQL injection framework for text-to-SQL systems, we show how state-of-the-art text-to-SQL parsers can be easily misled to produce harmful SQL statements that can invalidate user queries or compromise sensitive information about the database. The study explores two specific injection attacks, namely boolean-based injection and union-based injection, which use different types of triggers to achieve distinct goals in compromising the parser. Experimental results demonstrate that both medium-sized models based on fine-tuning and LLM-based parsers using prompting techniques are vulnerable to this type of attack, with attack success rates as high as 99% and 89%, respectively. We hope that this study will raise more concerns about the potential security risks of building natural language interfaces to databases.
Submission Number: 400
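
Illustrative sketch (not taken from the paper): the abstract names boolean-based and union-based injection without showing what such payloads look like. The Python snippet below uses the generic textbook forms of both on a toy in-memory SQLite database. The table `singer`, its rows, and the three query strings are hypothetical and chosen only for illustration; the snippet does not reproduce TrojanSQL's triggers, backdoor training, or the payloads evaluated in the paper. It only shows how an appended tautology can invalidate a user's filter and how an appended UNION clause can expose schema information.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a toy in-memory database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (name TEXT, age INTEGER);
        INSERT INTO singer VALUES ('Ann', 30), ('Bob', 45), ('Cy', 52);
        """
    )
    return conn


def run(conn: sqlite3.Connection, sql: str) -> list:
    """Execute a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql).fetchall()]


# What a parser should emit for "names of singers older than 40".
BENIGN = "SELECT name FROM singer WHERE age > 40"

# Boolean-based form: an appended tautology makes the filter always true,
# so the user's condition no longer has any effect.
BOOLEAN_INJECTED = BENIGN + " OR 1 = 1"

# Union-based form: an appended UNION pulls rows from another source,
# here SQLite's schema table, which stores the CREATE statements.
UNION_INJECTED = BENIGN + " UNION SELECT sql FROM sqlite_master"

conn = build_demo_db()
benign_rows = run(conn, BENIGN)
boolean_rows = run(conn, BOOLEAN_INJECTED)
union_rows = run(conn, UNION_INJECTED)

print("benign :", benign_rows)
print("boolean:", boolean_rows)
print("union  :", union_rows)
```

In this toy setting the boolean-injected query returns every row regardless of the age filter, while the union-injected query returns the expected names together with the table's schema definition. These correspond to the two outcomes the abstract describes: invalidated user queries and compromised information about the database.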