Learning Federated Neural Graph Databases for Answering Complex Queries from Distributed Knowledge Graphs

TMLR Paper4403 Authors

05 Mar 2025 (modified: 20 Mar 2025), Under review for TMLR, CC BY 4.0
Abstract: The increasing demand for deep learning-based foundation models has highlighted the importance of efficient data retrieval mechanisms. Neural graph databases (NGDBs) offer a compelling solution, leveraging neural spaces to store and query graph-structured data, thereby enabling large language models (LLMs) to access precise, contextually relevant information. However, current NGDBs are constrained to single-graph operation, limiting their capacity to reason across multiple, distributed graphs. Furthermore, the lack of support for multi-source graph data in existing NGDBs hinders their ability to capture the complexity and diversity of real-world data. In many applications, data is distributed across multiple sources, and the ability to reason across these sources is crucial for making informed decisions. This limitation is particularly problematic when dealing with sensitive graph data, as directly sharing and aggregating such data poses significant privacy risks. As a result, many applications that rely on NGDBs are forced to choose between compromising data privacy and sacrificing the ability to reason across multiple graphs. To address these limitations, we propose to learn Federated Neural Graph DataBases (FedNGDBs), a pioneering systematic framework that enables privacy-preserving reasoning over multi-source graph data. FedNGDBs leverage federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities and improving the overall quality of the graph data. Unlike existing methods, FedNGDBs can handle complex graph structures and relationships, making them suitable for various downstream tasks. We evaluate FedNGDBs on three real-world datasets, demonstrating their effectiveness in retrieving relevant information from multi-source graph data while keeping sensitive information secure on local devices. Our results show that FedNGDBs can efficiently retrieve answers to cross-graph queries, making them a promising approach for LLMs and other applications that rely on efficient data retrieval mechanisms.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Peilin_Zhao2
Submission Number: 4403