Abstract: We introduce Griffin, the first attempt at a foundation model designed specifically for relational databases (RDBs). Unlike previous smaller models that each focus on a single RDB task, Griffin unifies the data encoder and task decoder to handle diverse tasks. We further enhance the architecture with a cross-attention module and a novel aggregator. Griffin is pretrained on both single-table and RDB datasets, and it combines advanced encoders for categorical, numerical, and metadata features with cross-attention modules and enhanced message-passing neural networks (MPNNs) to capture the complexities of relational data. Evaluated on large-scale, heterogeneous, temporal graphs extracted from RDBs across various domains (spanning over 150 million nodes), Griffin performs comparably to or better than individually trained models, excels in low-data scenarios, and shows strong transferability to new datasets and tasks, aided by similarity and diversity in the pretraining data. These results highlight its potential as a universally applicable foundation model for RDBs. Code available at https://github.com/yanxwb/Griffin.
Lay Summary: While computers are great with text and images, they often struggle with the complex databases that businesses and scientists use every day. These "relational databases" are tricky because they store all sorts of different information across many interconnected tables, making it hard for a single AI to understand it all.
We've built an AI called "Griffin" that learns to see these databases like interconnected maps. Griffin is specially trained in steps to understand the varied data types it finds and, crucially, how all the different tables and pieces of information relate to each other.
This new approach helps Griffin make accurate predictions using the database information. More importantly, Griffin can use what it has learned on new, unfamiliar databases, even if there's very little data available for a new problem. Our work aims to make AI much better at finding valuable insights hidden inside the everyday databases that power our world.
Link To Code: https://github.com/yanxwb/Griffin
Primary Area: Deep Learning->Foundation Models
Keywords: Relational Database, Foundation Model
Submission Number: 3022