GOFA: A Generative One-For-All Model for Joint Graph Language Modeling

Published: 22 Jan 2025 · Last Modified: 01 Mar 2025 · ICLR 2025 Poster · CC BY 4.0
Keywords: GNN, graph foundation model, LLM
TL;DR: We propose a generative graph language model that achieves strong performance on a variety of tasks, showing its potential to serve as a future graph foundation model.
Abstract:

Foundation models, such as Large Language Models (LLMs) or Large Vision Models (LVMs), have emerged as among the most powerful tools in their respective fields. However, unlike text and image data, graph data do not have a definitive structure, posing great challenges to developing a Graph Foundation Model (GFM). For example, current attempts at designing general graph models either transform graph data into a language format for LLM-based prediction or still train a GNN model with an LLM as an assistant. The former can handle unlimited tasks, while the latter captures graph structure much better; yet no existing work achieves both simultaneously. In this paper, we first identify three key desirable properties of a GFM: self-supervised pretraining, fluidity in tasks, and graph awareness. To account for these properties, we extend conventional language modeling to the graph domain and propose a novel generative graph language model, GOFA. The model interleaves randomly initialized GNN layers into a frozen pre-trained LLM so that semantic and structural modeling abilities are organically combined. GOFA is pre-trained on newly proposed graph-level next-word prediction, question-answering, structural understanding, and information-retrieval tasks to obtain the above GFM properties. The pre-trained model is then instruction fine-tuned to obtain task-solving ability. GOFA is evaluated on various downstream datasets unseen during pre-training and fine-tuning, demonstrating a strong ability to solve structural and contextual problems in zero-shot scenarios. The code is available at https://github.com/JiaruiFeng/GOFA.
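The abstract's central architectural idea, interleaving trainable GNN layers between frozen LLM transformer blocks, can be sketched as follows. This is a minimal illustration under stated assumptions, not GOFA's actual implementation: all names (`GNNLayer`, `InterleavedGraphLM`, `llm_blocks`, `every`) and design details (mean-pooled node summaries, residual broadcast of graph context back to tokens) are hypothetical choices for exposition; see the linked repository for the real model.

```python
# Assumed sketch, not GOFA's real code: interleave randomly initialized GNN
# layers into a frozen pre-trained LLM. Each node carries its own token
# sequence; LLM blocks model text per node, and inserted GNN layers
# exchange information along graph edges.
import torch
import torch.nn as nn


class GNNLayer(nn.Module):
    """Toy message-passing layer: mean-aggregate in-neighbors, then project."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, h: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        src, dst = edge_index  # edge_index: (2, num_edges)
        agg = torch.zeros_like(h).index_add_(0, dst, h[src])
        deg = torch.zeros(h.size(0), dtype=h.dtype, device=h.device)
        deg = deg.index_add_(0, dst, torch.ones_like(dst, dtype=h.dtype))
        agg = agg / deg.clamp(min=1).unsqueeze(-1)  # mean over in-neighbors
        return self.proj(torch.cat([h, agg], dim=-1))


class InterleavedGraphLM(nn.Module):
    """Frozen LLM blocks with trainable GNN layers inserted between them."""

    def __init__(self, llm_blocks: nn.ModuleList, dim: int, every: int = 2):
        super().__init__()
        for p in llm_blocks.parameters():
            p.requires_grad = False  # the pre-trained LLM stays frozen
        self.llm_blocks = llm_blocks
        self.gnns = nn.ModuleDict(
            {str(i): GNNLayer(dim) for i in range(len(llm_blocks)) if i % every == 0}
        )

    def forward(self, node_tokens: torch.Tensor, edge_index: torch.Tensor):
        # node_tokens: (num_nodes, seq_len, dim), one text sequence per node
        h = node_tokens
        for i, block in enumerate(self.llm_blocks):
            h = block(h)  # per-node language modeling
            if str(i) in self.gnns:
                summary = h.mean(dim=1)  # pool tokens into a node state
                summary = self.gnns[str(i)](summary, edge_index)
                h = h + summary.unsqueeze(1)  # broadcast graph context back
        return h


# Usage with stand-in transformer blocks:
blocks = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
     for _ in range(4)]
)
model = InterleavedGraphLM(blocks, dim=64)
out = model(torch.randn(5, 12, 64), torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]]))
```

The pooled-summary and broadcast steps here stand in for however the actual model couples token states with graph structure; the sketch conveys only the freeze-and-interleave pattern the abstract describes.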

Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2702