TL;DR: We characterize the added representational power conferred on GNNs by edge embeddings, under additional constraints such as memory and symmetry.
Abstract: Graph neural networks (GNNs) are the dominant approach to solving machine learning problems defined over graphs. Despite much theoretical and empirical work in recent years, our understanding of finer-grained aspects of architectural design for GNNs remains impoverished. In this paper, we consider the benefits of architectures that maintain and update edge embeddings. On the theoretical front, under a suitable computational abstraction for a layer in the model, as well as memory constraints on the embeddings, we show that there are natural tasks on graphical models for which architectures leveraging edge embeddings can be much shallower. Our techniques are inspired by results on time-space tradeoffs in theoretical computer science. Empirically, we show that architectures maintaining edge embeddings almost always improve on their node-based counterparts---frequently significantly so on topologies that have "hub" nodes.
Lay Summary: Graphs are a useful way of encoding "similarity" structure in data: for example, chemical bonds in molecules, or friendships in social networks. Graph Neural Networks (GNNs) learn to predict properties of graph-structured data -- such as the chemical properties of a given molecule. Typically, they do this by associating numerical features with each node of the graph, and then iteratively updating those features based on the features of neighboring nodes. A different paradigm, however, is to associate features with the *edges* of the graph (a small illustrative sketch of both update rules follows this summary). When might this lead to more accurate or faster prediction?
In this paper, we study this question by looking at how many iterations GNNs need, and how much numerical information they need to "remember" at each node or edge, in order to compute a given graph property. We show that when edge-based GNNs and node-based GNNs are constrained to the same amount of memory, edge-based GNNs can compute some graph properties in far fewer iterations than node-based GNNs. Intuitively, node-based GNNs are bottlenecked by "hub" nodes that need to pass lots of information to many neighboring nodes, whereas edge-based GNNs are not. We find similar results in our experiments on both artificial graphs and real-world benchmarks from chemistry and computer vision.
Our results suggest a fundamental benefit of edge-based GNNs, though edge-based architectures also have downsides (particularly on graphs with far more edges than nodes). A key direction for future research is to get the "best of both worlds" between the edge-based and node-based paradigms.
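To make the two paradigms concrete, below is a minimal, hypothetical sketch of one round of each update rule in Python. It is not the paper's construction: the adjacency-list representation, the generic `update` callable standing in for a learned layer, and the toy aggregation in the usage lines are illustrative assumptions only.

```python
# Illustrative sketch only (assumptions: undirected graph stored as a symmetric
# adjacency list `adj = {node: [neighbors]}`; `update` is any function standing
# in for a learned layer that combines an embedding with a list of messages).

def node_based_step(adj, h, update):
    """One round of node-based message passing: each node aggregates its
    neighbors' embeddings and refreshes its own embedding."""
    return {v: update(h[v], [h[u] for u in adj[v]]) for v in adj}

def edge_based_step(adj, e, update):
    """One round of edge-based message passing: each directed edge (u, v)
    aggregates the embeddings of the other edges arriving at u."""
    return {
        (u, v): update(e[(u, v)], [e[(w, u)] for w in adj[u] if w != v])
        for u in adj
        for v in adj[u]
    }

# Toy usage on a small "hub" graph centered at node 0 (all values are made up).
adj = {0: [1, 2], 1: [0], 2: [0]}
h = {v: [float(v)] for v in adj}                  # node embeddings
e = {(u, v): [1.0] for u in adj for v in adj[u]}  # directed-edge embeddings
combine = lambda x, msgs: [x[0] + sum(m[0] for m in msgs)]
h = node_based_step(adj, h, combine)
e = edge_based_step(adj, e, combine)
```

In the node-based rule, all information flowing through the hub node 0 must fit in node 0's single embedding; in the edge-based rule, each incident edge keeps its own embedding, which is the intuition behind the memory bottleneck discussed above.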
Primary Area: Theory->Deep Learning
Keywords: graph neural networks, theory, representational power, communication complexity, memory tradeoffs, edge embeddings
Submission Number: 12858