Graph2Token: Make LLMs Understand Molecule Graphs

Published: 17 Jun 2024, Last Modified: 16 Jul 2024
Venue: AccMLBio Poster
Readers: Everyone
License: CC BY 4.0
Keywords: Molecule Graph Token Alignment, Graph Tokenizer, LLM token vocabulary, Lightweight Solution.
TL;DR: This paper proposes an efficient, lightweight method that aligns a molecular graph token with LLM tokens.
Abstract: Large language models (LLMs) excel at a wide range of text-related tasks, but processing graph data such as molecules remains challenging for them. To bridge this gap, this paper proposes Graph2Token, an efficient solution that aligns a graph token with LLM tokens. The key idea is to represent a graph token using the LLM token vocabulary, without fine-tuning the LLM backbone. In this way, the potential of existing LLMs can be applied to downstream molecular prediction tasks. Extensive experiments demonstrate the effectiveness of the proposed Graph2Token. Code is available at https://github.com/ZeLeBron/Graph2Token.
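One way to read the key idea above is that a graph embedding is re-expressed as a weighted mix of the frozen LLM token embeddings, so the resulting token lies in the same space the LLM already understands. The sketch below illustrates that reading only; it is not taken from the Graph2Token code. The function name `align_graph_token`, the projection `w_query`, the attention-style weighting, and all dimensions are assumptions made for illustration, and random arrays stand in for a real graph encoder and a real LLM embedding table.

```python
import numpy as np


def align_graph_token(graph_emb, vocab_emb, w_query, temperature=1.0):
    """Express a graph embedding as a softmax-weighted mix of vocabulary embeddings.

    Hypothetical illustration: vocab_emb is treated as frozen, and w_query is
    the only part that would be trained in this sketch.
    """
    # Project the graph embedding into the LLM embedding space.
    query = graph_emb @ w_query                          # shape: (d_llm,)
    # Score every vocabulary token against the projected graph embedding.
    scale = temperature * np.sqrt(vocab_emb.shape[1])
    scores = vocab_emb @ query / scale                   # shape: (vocab_size,)
    # Numerically stable softmax over the vocabulary.
    scores = scores - scores.max()
    weights = np.exp(scores)
    weights = weights / weights.sum()
    # The aligned token is a convex combination of vocabulary embeddings.
    aligned = weights @ vocab_emb                        # shape: (d_llm,)
    return aligned, weights


rng = np.random.default_rng(0)
vocab_emb = rng.normal(size=(1000, 64))      # stand-in for a frozen LLM embedding table
graph_emb = rng.normal(size=32)              # stand-in for a graph encoder output
w_query = rng.normal(size=(32, 64)) * 0.1    # assumed trainable projection

aligned, weights = align_graph_token(graph_emb, vocab_emb, w_query)
```

Because the weights are non-negative and sum to one, the aligned token stays inside the region spanned by existing token embeddings, which is one plausible reason such an alignment could work without updating the LLM backbone.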
Submission Number: 50