Abstract: Natural Language Interfaces to Databases (NLIDB) systems free end users from having to write queries in complex languages such as SQL by translating the input natural language (NL)
queries to SQL automatically. Although a significant volume of
research has focused on this space, most state-of-the-art systems
can at best handle simple select-project-join queries. There has
been little to no research on extending the capabilities of NLIDB
systems to handle complex business intelligence (BI) queries that
often involve nesting as well as aggregation. In this paper, we
present ATHENA++, an end-to-end system that can answer such
complex queries in natural language by translating them into nested
SQL queries. In particular, ATHENA++ combines linguistic patterns from NL queries with deep domain reasoning using ontologies to enable nested query detection and generation. We also introduce a new benchmark data set (FIBEN), which consists of 300
NL queries, corresponding to 237 distinct complex SQL queries
on a database with 152 tables, conforming to an ontology derived
from standard financial ontologies (FIBO and FRO). We conducted
extensive experiments comparing ATHENA++ with two state-of-the-art NLIDB systems, using both FIBEN and the prominent Spider benchmark. ATHENA++ consistently outperforms both systems across all benchmark data sets with a wide variety of complex queries, achieving 88.33% accuracy on the FIBEN benchmark
and 78.89% accuracy on the Spider benchmark, beating the best reported accuracy results on the dev set by 8%.