Do Transformers Really Perform Badly for Graph Representation?Download PDF

21 May 2021, 20:41 (edited 21 Jan 2022)NeurIPS 2021 PosterReaders: Everyone
  • Keywords: Transformer, Graph Representation, OGB-LSC
  • TL;DR: We have explored the direct application of Transformers to graph representation. With three simple, yet effective graph structural encodings, the proposed GraphFormer works surprisingly well on a wide range of popular benchmark datasets.
  • Abstract: The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet, it has not achieved competitive performance on popular leaderboards of graph-level prediction compared to mainstream GNN variants. Therefore, it remains a mystery how Transformers could perform well for graph representation learning. In this paper, we solve this mystery by presenting Graphormer, which is built upon the standard Transformer architecture, and could attain excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. Besides, we mathematically characterize the expressive power of Graphormer and exhibit that with our ways of encoding the structural information of graphs, many popular GNN variants could be covered as the special cases of Graphormer. The code and models of Graphormer will be made publicly available at \url{}.
  • Supplementary Material: pdf
  • Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
  • Code:
13 Replies