Machine Learning within a Graph Database: A Case Study on Link Prediction for Scholarly Data

Published: 01 Jan 2021, Last Modified: 26 Aug 2024, ICEIS (1) 2021, CC BY-SA 4.0
Abstract: When data management and ML tools are combined, a common problem is that ML frameworks may require moving the data out of its traditional storage (i.e., databases) for model building. In such scenarios, it could be more effective to adopt in-database statistical functionality (Cohen et al., 2009). Such functionality has received attention for relational databases, but for graph database systems there are too few studies to guide users, either by clarifying the role of the database or by identifying the pain points that require attention. In this paper we make an early feasibility assessment of such processing in the graph domain, prototyping an in-database, ML-driven case study on link prediction on a state-of-the-art graph database (Neo4j). We identify a general series of steps and a common-sense approach for database support. We find limited differences in most steps across the processing setups, suggesting a need for further evaluation. We identify b