Learning Federated Neural Graph Databases for Answering Complex Queries from Distributed Knowledge Graphs

Published: 08 Jul 2025, Last Modified: 08 Jul 2025; Accepted by TMLR; CC BY 4.0
Abstract: The increasing demand for deep learning-based foundation models has highlighted the importance of efficient data retrieval mechanisms. Neural graph databases (NGDBs) offer a compelling solution, leveraging neural spaces to store and query graph-structured data, thereby enabling LLMs to access precise and contextually relevant information. However, current NGDBs are constrained to single-graph operation, limiting their capacity to reason across multiple, distributed graphs. Furthermore, the lack of support for multi-source graph data in existing NGDBs hinders their ability to capture the complexity and diversity of real-world data. In many applications, data is distributed across multiple sources, and the ability to reason across these sources is crucial for making informed decisions. This limitation is particularly problematic when dealing with sensitive graph data, as directly sharing and aggregating such data poses significant privacy risks. As a result, many applications that rely on NGDBs are forced to choose between compromising data privacy and sacrificing the ability to reason across multiple graphs. To address these limitations, we propose learning a Federated Neural Graph DataBase (FedNGDB), a pioneering systematic framework that enables privacy-preserving reasoning over multi-source graph data. FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities and improving the overall quality of graph data. Unlike existing methods, FedNGDB can handle complex graph structures and relationships, making it suitable for various downstream tasks. We evaluate FedNGDB on three real-world datasets, demonstrating its effectiveness in retrieving relevant information from multi-source graph data while keeping sensitive information secure on local devices. Our results show that FedNGDB can efficiently retrieve answers to cross-graph queries, making it a promising approach for LLMs and other applications that rely on efficient data retrieval mechanisms.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Here’s a summary of the key revisions compared to the previous version:
- Claims and Experimental Analysis: We carefully reviewed all statements to ensure accuracy and avoid overstating performance, expanded the comparative analysis, and clarified performance benchmarks (e.g., improved Figure 4 to better differentiate query-type performance).
- Experimental Design and Evaluation Update: We elaborated on why the current datasets were chosen and polished the discussion of the experiments.
- Methodological Details: We revised the methodology description to make it clearer.
- Figure Improvement: We improved Figure 2 and Figure 4 to make the illustrations clearer.
- Experimental Results: We added experimental results to better illustrate the findings.
Code: https://github.com/HKUST-KnowComp/FedNGDB
Assigned Action Editor: ~Peilin_Zhao2
Submission Number: 4403