Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Modern sequence models (e.g., Transformers and linear RNNs) have emerged as dominant backbones of recent deep learning frameworks, mainly due to their efficiency, representational power, and/or ability to capture long-range dependencies. Recently, adopting these sequence models for graph-structured data has gained popularity as an alternative to Message Passing Neural Networks (MPNNs). There is, however, no common foundation for what constitutes a good graph sequence model, nor a mathematical description of the benefits and deficiencies of adopting different sequence models for learning on graphs. To this end, we introduce the Graph Sequence Model (GSM), a unifying framework for applying sequence models to graph data. The GSM framework allows us to understand, evaluate, and compare the power of different sequence model backbones in graph tasks. Building on this insight, we propose GSM++, a fast hybrid model that hierarchically tokenizes the graph using Hierarchical Affinity Clustering (HAC) and then encodes these sequences via a hybrid architecture. Our theoretical and experimental findings confirm that GSM++ outperforms baseline models on most benchmarks.
Lay Summary: In recent years, designing more efficient machine learning models that can learn to transform one sequence into another has gained much attention. Although adopting these models for graph-structured data has become popular, there is no common foundation for what makes a learning algorithm effective on graphs. In this work, we present a unifying mathematical framework that explains how a graph can be translated into a sequence, so that sequence models can learn the dependencies among its entities. Building on our theoretical insights, we propose GSM++, a fast model that hierarchically translates the interactions in a graph into sequences and then encodes these sequences via a novel hybrid architecture. We provide theoretical and experimental results that confirm the effectiveness of GSM++.
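
To make the pipeline concrete, below is a minimal, self-contained Python sketch of the two stages described above: hierarchically tokenizing a graph with affinity clustering, then encoding each node's token sequence with a hybrid (recurrent + attention) backbone. All names here (`affinity_cluster`, `hierarchical_tokens`, `hybrid_encode`), the fixed-decay linear recurrence, and the shared toy embedding table are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a GSM++-style pipeline; all helpers are
# hypothetical stand-ins, not the paper's code.
import numpy as np

def affinity_cluster(edges, n):
    """One round of affinity clustering: every node joins its
    highest-affinity neighbour, and connected components of those choices
    become the clusters of the next level. `edges` maps (u, v) -> affinity
    over nodes 0..n-1; returns a dict node -> cluster id."""
    best = {u: None for u in range(n)}
    for (u, v), w in edges.items():
        for a, b in ((u, v), (v, u)):
            if best[a] is None or w > best[a][0]:
                best[a] = (w, b)
    parent = list(range(n))                      # union-find over the choices
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, choice in best.items():
        if choice is not None:
            parent[find(u)] = find(choice[1])
    relabel = {r: i for i, r in enumerate(sorted({find(u) for u in range(n)}))}
    return {u: relabel[find(u)] for u in range(n)}

def hierarchical_tokens(edges, n, levels=2):
    """Token sequence per node: [node, cluster@level1, cluster@level2, ...]."""
    seqs = {u: [u] for u in range(n)}
    label = {u: u for u in range(n)}
    cur_edges = dict(edges)
    for _ in range(levels):
        cmap = affinity_cluster(cur_edges, len(set(label.values())))
        label = {u: cmap[label[u]] for u in label}
        for u in seqs:
            seqs[u].append(label[u])
        nxt = {}                                 # contract graph, sum affinities
        for (u, v), w in cur_edges.items():
            cu, cv = cmap[u], cmap[v]
            if cu != cv:
                key = (min(cu, cv), max(cu, cv))
                nxt[key] = nxt.get(key, 0.0) + w
        cur_edges = nxt
    return seqs

def hybrid_encode(seq_emb):
    """Hybrid encoder sketch: a cheap linear-recurrent scan for a sequential
    summary, followed by one softmax self-attention step for a global view."""
    d = seq_emb.shape[1]
    decay, h, states = 0.9, np.zeros(d), []
    for x in seq_emb:                            # linear RNN with fixed decay
        h = decay * h + (1 - decay) * x
        states.append(h)
    H = np.stack(states)
    scores = H @ H.T / np.sqrt(d)                # single-head self-attention
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    return (scores / scores.sum(axis=1, keepdims=True)) @ H

# Toy usage: a 4-node path graph whose strongest affinities merge first.
edges = {(0, 1): 3.0, (1, 2): 1.0, (2, 3): 2.0}
seqs = hierarchical_tokens(edges, n=4, levels=2)
table = np.random.default_rng(0).normal(size=(8, 16))  # shared toy embeddings
print(seqs[0], hybrid_encode(table[seqs[0]]).shape)
```

In the actual model, the per-node sequences would be embedded with learned tables and the hybrid encoder trained end to end; this sketch only mirrors the data flow from graph to hierarchy to sequence encoding.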
Primary Area: Deep Learning->Graph Neural Networks
Keywords: Graph Learning, Sequence Models, Graph Transformers, Hierarchical Clustering
Submission Number: 7378