Deep Bidirectional Language-Knowledge Graph Pretraining

Michihiro Yasunaga; Antoine Bosselut; Hongyu Ren; Xikun Zhang; Christopher D Manning; Percy Liang; Jure Leskovec

Deep Bidirectional Language-Knowledge Graph Pretraining

Michihiro Yasunaga, Antoine Bosselut, Hongyu Ren, Xikun Zhang, Christopher D Manning, Percy Liang, Jure Leskovec

Published: 31 Oct 2022, Last Modified: 06 Apr 2025NeurIPS 2022 AcceptReaders: Everyone

Keywords: pretraining, language model, knowledge graph, question answering, commonsense, reasoning, foundation model, self-supervised learning, biomedical

TL;DR: We propose Deep Bidirectional Language-Knowledge Graph Pretraining, a method to pretrain a deeply interactive language-knowledge model from text and knowledge graph at scale, and show strength in knowledge/reasoning-intensive tasks e.g. multi-hop QA

Abstract: Pretraining a language model (LM) on text has been shown to help various downstream NLP tasks. Recent works show that a knowledge graph (KG) can complement text data, offering structured background knowledge that provides a useful scaffold for reasoning. However, these works are not pretrained to learn a deep fusion of the two modalities at scale, limiting the potential to acquire fully joint representations of text and KG. Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale. Specifically, our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities. We pretrain this model by unifying two self-supervised reasoning tasks, masked language modeling and KG link prediction. DRAGON outperforms existing LM and LM+KG models on diverse downstream tasks including question answering across general and biomedical domains, with +5% absolute gain on average. In particular, DRAGON achieves notable performance on complex reasoning about language and knowledge (+10% on questions involving long contexts or multi-step reasoning) and low-resource QA (+8% on OBQA and RiddleSense), and new state-of-the-art results on various BioNLP tasks. Our code and trained models are available at https://github.com/michiyasunaga/dragon.

Supplementary Material: pdf

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/deep-bidirectional-language-knowledge-graph/code)

15 Replies

Loading