From SQL to Knowledge Graphs: An LLM-Driven MultiAgent Approach with Data Schema Improvement

TMLR Paper6951 Authors

09 Jan 2026 (modified: 15 Jan 2026) | Under review for TMLR | CC BY 4.0
Abstract: Relational database management systems (RDBMS) face several limitations, including slow execution of multi-hop queries and a lack of explainability through graphical interpretation. In contrast, graph databases offer a more intuitive and efficient data schema that enables faster execution on large datasets. Most existing RDBMS-to-graph conversion pipelines rely on traditional loading commands and Cypher queries. However, the use of an LLM to generate an effective graph data schema that significantly reduces ambiguity in the graph database remains underexplored in the current research literature. This paper presents a novel algorithm that bridges RDBMS and graph databases: an LLM-powered ETL agent standardizes table and column names before saving them to the Data Mart, and a Multi-Agent System runs a looping discussion among ETL, Analyzer, and Graph agents that optimizes the final design by iteratively suggesting and scoring the graph database schema. The final graph database must meet three criteria before it is accepted for data conversion: Accuracy, Groundedness, and Faithfulness. The system thus provides an end-to-end pipeline that automatically converts a tabular database into a graph database. Our study highlights notable efficiency gains from using the converted graph database, evaluated on 1,081 samples of a BFSI (banking, financial services, and insurance) dataset across three levels of complexity (easy, medium, and hard). Specifically, CypherAgent achieves 85.6% accuracy on Q&A tasks using the graph database, which is 12.12% higher than the accuracy achieved by an SQLAgent on the PostgreSQL RDBMS across all queries. For easy, medium, and hard queries, the graph database attains accuracies of 90.43%, 81.98%, and 80.06%, respectively, surpassing the RDBMS by 17.8%, 4.2%, and 11.0%. Additionally, the graph database is faster, reducing latency by a factor of approximately three.
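The iterative suggest-and-score process described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `refine_schema`, the `Review` type, the 0.8 acceptance threshold, the round limit, and the two stub agents are all hypothetical, and a real system would call LLM-backed agents in place of the stubs. Only the three criteria names are taken from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# The three acceptance criteria named in the abstract.
CRITERIA = ("accuracy", "groundedness", "faithfulness")


@dataclass
class Review:
    """Hypothetical scoring result: one score per criterion plus feedback."""
    scores: Dict[str, float]
    feedback: str


def refine_schema(
    propose: Callable[[str], str],
    score: Callable[[str], Review],
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
    max_rounds: int = 5,     # assumed round limit, not taken from the paper
) -> Tuple[Optional[str], int, bool]:
    """Loop: propose a schema, score it, stop once every criterion passes."""
    feedback = ""
    schema: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        schema = propose(feedback)
        review = score(schema)
        if all(review.scores[c] >= threshold for c in CRITERIA):
            return schema, round_no, True
        feedback = review.feedback
    # No proposal passed within the round limit; return the last attempt.
    return schema, max_rounds, False


def stub_propose(feedback: str) -> str:
    """Stand-in for a schema-suggesting agent (hypothetical behaviour)."""
    if "relationship" in feedback:
        return "(:Customer)-[:OWNS]->(:Account)"
    return "(:Customer), (:Account)"


def stub_score(schema: str) -> Review:
    """Stand-in for a scoring agent: passes only schemas with a relationship."""
    has_relationship = "-[:" in schema
    value = 0.9 if has_relationship else 0.5
    feedback = "" if has_relationship else "add a relationship between the nodes"
    return Review({c: value for c in CRITERIA}, feedback)


if __name__ == "__main__":
    print(refine_schema(stub_propose, stub_score))
```

With the stubs above, the first proposal lacks a relationship and is rejected, the feedback is passed back to the proposer, and the second proposal is accepted. Requiring every criterion to clear the threshold, rather than averaging the scores, is one possible reading of "meets three criteria"; the paper may combine them differently.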
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Hsuan-Tien_Lin1
Submission Number: 6951