Understanding and Tackling Over-Dilution in Graph Neural Networks

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Graph Neural Networks, Message Passing Neural Networks, Over-dilution, Transformers
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We present a novel perspective on the limitations of message passing neural networks, the over-dilution phenomenon.
Abstract: Message Passing Neural Networks (MPNNs) have become the predominant architecture for representation learning on graphs. Despite their promise, several inherent limitations have been identified, such as over-smoothing and over-squashing; both theoretical frameworks and empirical investigations substantiate these limitations and have driven advances toward more informative representations. In this paper, we investigate the limitations of MPNNs from a novel perspective. We observe that even in a single layer, a node's own information can become considerably diluted, potentially degrading performance. To examine this phenomenon in depth, we introduce the concept of *Over-dilution* and formulate it with two types of dilution factors: *intra-node dilution* and *inter-node dilution*. *Intra-node dilution* refers to attributes losing their influence within each node because they are combined with equal weight regardless of their practical importance. *Inter-node dilution* occurs when the representations of neighboring nodes are aggregated, diminishing the influence of the node itself on its final representation. We also introduce a transformer-based solution that alleviates over-dilution by merging attribute representations according to attention scores between node-level and attribute-level representations. Our findings provide new insights and contribute to the development of informative representations.
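The submission's code is not shown here, so the following is a minimal, hypothetical sketch of the attention-based merging the abstract describes: a node-level representation attends over per-attribute representations, so attributes are combined by learned importance rather than with equal weight (countering *intra-node dilution*). All names (`AttributeAttention`, `d_model`) and the token construction are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch, not the authors' code: merge attribute-level
# representations into a node representation via attention, so that
# informative attributes receive larger weights instead of a uniform average.
import torch
import torch.nn as nn

class AttributeAttention(nn.Module):
    def __init__(self, num_attrs: int, d_model: int, n_heads: int = 4):
        super().__init__()
        # One learnable embedding per attribute (attribute-level representations).
        self.attr_embed = nn.Embedding(num_attrs, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, node_repr: torch.Tensor, attr_values: torch.Tensor) -> torch.Tensor:
        # node_repr:   (N, d_model)    node-level representations (queries)
        # attr_values: (N, num_attrs)  raw attribute values per node
        # Attribute-level tokens: each attribute embedding scaled by its value.
        attr_tokens = self.attr_embed.weight.unsqueeze(0) * attr_values.unsqueeze(-1)
        # Attention scores between the node-level query and attribute-level
        # keys decide how much each attribute contributes to the merged output.
        merged, _ = self.attn(node_repr.unsqueeze(1), attr_tokens, attr_tokens)
        return merged.squeeze(1)

# Toy usage: 5 nodes, 8 attributes, 16-dimensional model.
x = torch.rand(5, 8)            # attribute matrix
node_repr = torch.rand(5, 16)   # e.g., the output of an MPNN layer
out = AttributeAttention(num_attrs=8, d_model=16)(node_repr, x)
print(out.shape)                # torch.Size([5, 16])
```

For intuition on *inter-node dilution*: under mean aggregation with self-loops, a node's own contribution to its updated representation is roughly 1/(deg(v)+1), so high-degree nodes retain little of their own information even after a single layer; the paper's exact dilution factors may be defined differently.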
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5926