Abstract: Synthesizing SQL queries from natural language is a long-standing open problem that has recently attracted considerable interest. The de facto approach is to employ a sequence-to-sequence-style model, which necessarily requires the SQL queries to be serialized. Since the same SQL query may have multiple equivalent serializations, training a sequence-to-sequence-style model is sensitive to which serialization is chosen. This phenomenon is documented as the "order-matters" problem. Existing state-of-the-art approaches rely on reinforcement learning to reward the decoder when it generates any of the equivalent serializations. However, we observe that the improvement from reinforcement learning is limited.
In this paper, we propose a novel approach, SQLNet, to fundamentally solve this problem by avoiding the sequence-to-sequence structure when the order does not matter. In particular, we employ a sketch-based approach where the sketch contains a dependency graph, so that each prediction takes into consideration only the previous predictions that it depends on. In addition, we propose a sequence-to-set model as well as a column attention mechanism to synthesize the query based on the sketch. By combining these techniques, we show that SQLNet can outperform the prior art by 9% to 13% on the WikiSQL task.
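The "order-matters" problem and the sequence-to-set idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SQLNet implementation: the example conditions, the column names, the per-column scores, the 0.5 threshold, and the helper names `serialize` and `predict_column_set` are all hypothetical and chosen only for illustration. The selection rule actually used in the paper may differ.

```python
from itertools import permutations

# Hypothetical WHERE conditions for a single query.
# In SQL, the order of AND-ed conditions does not change the result.
conditions = [("age", ">", "30"), ("city", "=", "'Paris'"), ("year", "=", "2017")]


def serialize(conds):
    """Render the conditions as one WHERE clause in the given order."""
    return " AND ".join(f"{col} {op} {val}" for col, op, val in conds)


# Every permutation yields a different token sequence for the same query,
# so a sequence decoder trained on one ordering is penalized for the others.
serializations = {serialize(p) for p in permutations(conditions)}


def predict_column_set(scores, threshold=0.5):
    """Set-style prediction: decide on each column independently of any order.

    `scores` maps column name -> assumed probability that the column appears
    in the WHERE clause. The threshold is an illustrative assumption.
    """
    return {col for col, p in scores.items() if p >= threshold}


# Made-up scores standing in for a model's per-column outputs.
scores = {"age": 0.91, "city": 0.78, "year": 0.66, "name": 0.08}
predicted = predict_column_set(scores)

# The target is a set, so no particular ordering has to be matched.
target = {col for col, _, _ in conditions}

if __name__ == "__main__":
    print(len(serializations), "equivalent serializations")
    print("predicted set matches target:", predicted == target)
```

Because the target in this sketch is a set rather than a sequence, the comparison `predicted == target` is indifferent to the order in which the conditions were written, which is the property the abstract attributes to avoiding the sequence-to-sequence structure.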
Code: [![Papers with Code](/images/pwc_icon.svg) 13 community implementations](https://paperswithcode.com/paper/?openreview=SkYibHlRb)
Data: [WikiSQL](https://paperswithcode.com/dataset/wikisql)
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 14 code implementations](https://www.catalyzex.com/paper/sqlnet-generating-structured-queries-from/code)