TrojanSQL: SQL Injection against Natural Language Interface to Database

Jinchuan Zhang; Yan Zhou; Binyuan Hui; Yaxin Liu; Ziming Li; Songlin Hu

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Main

Submission Type: Regular Long Paper

Submission Track: Ethics in NLP

Submission Track 2: NLP Applications

Keywords: text-to-SQL, NLIDB, security, SQL Injection, NL2Code

TL;DR: We propose a novel attack paradigm against natural language interfaces to databases (NLIDB).

Abstract: Text-to-SQL technology has significantly improved the efficiency of accessing and manipulating databases. However, little research has examined the vulnerabilities that arise from malicious user interaction. We propose TrojanSQL, a backdoor-based SQL injection framework for text-to-SQL systems, and show that state-of-the-art text-to-SQL parsers can easily be misled into producing harmful SQL statements that invalidate user queries or compromise sensitive information about the database. The study explores two specific injection attacks, $\textit{boolean-based injection}$ and $\textit{union-based injection}$, which use different types of triggers to achieve distinct goals in compromising the parser. Experimental results demonstrate that both medium-sized fine-tuned models and LLM-based parsers that rely on prompting are vulnerable to this type of attack, with attack success rates as high as 99\% and 89\%, respectively. We hope that this study will raise awareness of the potential security risks of building natural language interfaces to databases.
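
For readers unfamiliar with the two injection classes named in the abstract, the following is a minimal, generic sketch of how a boolean-based and a union-based payload change what a query returns. It is not the paper's triggers, payloads, or attack pipeline; the `singer` and `account` tables, their contents, and the `demo` helper are hypothetical and exist only for illustration, using Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def demo():
    """Run a benign query and two injected variants on a toy database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and rows, invented for this illustration only.
    conn.executescript(
        """
        CREATE TABLE singer (name TEXT, country TEXT);
        CREATE TABLE account (username TEXT, password TEXT);
        INSERT INTO singer VALUES ('Ana', 'France'), ('Bo', 'Japan');
        INSERT INTO account VALUES ('admin', 's3cret');
        """
    )

    # The SQL a parser might emit for "Which singers are from France?"
    benign = "SELECT name FROM singer WHERE country = 'France'"

    # Boolean-based: an always-true condition invalidates the user's filter.
    boolean_injected = benign + " OR 1 = 1"

    # Union-based: a second SELECT leaks data from an unrelated table.
    union_injected = benign + " UNION SELECT password FROM account"

    results = {
        "benign": conn.execute(benign).fetchall(),
        "boolean": conn.execute(boolean_injected).fetchall(),
        "union": conn.execute(union_injected).fetchall(),
    }
    conn.close()
    return results


if __name__ == "__main__":
    for label, rows in demo().items():
        print(label, rows)
```

In this sketch the boolean variant returns every singer regardless of country, while the union variant appends a value from the `account` table to the result, which mirrors the two goals the abstract describes: invalidating the user's query and exposing sensitive database content.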

Submission Number: 400
