Large Language Models as Interfaces to Structured Data: A Survey

21 Jan 2026 (modified: 21 Apr 2026) | Rejected by TMLR | Everyone | CC BY 4.0
Abstract: Structured data, including tables, relational databases, and knowledge graphs, underpins a wide range of scientific, industrial, and decision-making workflows. Although large language models (LLMs) are primarily trained on unstructured text, recent work has demonstrated their effectiveness in tasks involving structured data, such as table reasoning, natural language to SQL translation, data transformation, and automated analytics. These developments indicate that LLMs can function as a general interface between natural language inputs, structured representations, and executable operations. This survey presents a theory-oriented overview of LLMs for structured data. We introduce an abstract formulation that characterizes structured data tasks by the structured state, the query or control input, the output space, and the execution environment. Based on this formulation, which we revisit in the taxonomy and evaluation sections, we propose a taxonomy that organizes existing methods by the functional role of the LLM (encoding, reasoning, translation, planning, and agent-based execution), by representation strategy, and by learning signal. This taxonomy highlights shared design principles across different task settings and clarifies methodological trade-offs. We examine evaluation protocols, generalization properties, and failure modes specific to structured data tasks, with an emphasis on faithfulness, schema robustness, and execution correctness. Finally, we outline open research directions for LLM-based structured data systems, including challenges related to scalability, symbolic and neural integration, and learning with execution-based supervision. The survey aims to provide a unified conceptual framework and a reference point for future research on large language models applied to structured data.
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Revised paper as per reviews
Assigned Action Editor: ~Ellen_Vitercik1
Submission Number: 7095