Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Lehman Go Indifferent

Lorenz Kummer; Samir Moustafa; Nils Morten Kriege; Wilfried N. Gansterer

22 Sept 2023 (modified: 11 Feb 2024), submitted to ICLR 2024

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Bit Flip Attacks, Graph Neural Network

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: The Injectivity Bit Flip Attack exploits the mathematical properties of GNNs to increase their vulnerability to bit flips and degrade their ability to distinguish graph structures, causing greater destruction with fewer flips than traditional attacks.

Abstract: Prior attacks on graph neural networks have mostly focused on graph poisoning and evasion, neglecting the network’s weights and biases. Traditional weight-based fault injection attacks, such as bit flip attacks used for convolutional neural networks, do not consider the unique properties of graph neural networks. We propose the Injectivity Bit Flip Attack, the first bit flip attack designed specifically for graph neural networks. Our attack targets the learnable neighborhood aggregation functions in quantized message passing neural networks, degrading their ability to distinguish graph structures so that they lose the expressivity of the Weisfeiler-Lehman test. Our findings suggest that exploiting mathematical properties specific to certain graph neural network architectures can significantly increase their vulnerability to bit flip attacks. Injectivity Bit Flip Attacks can degrade the maximally expressive Graph Isomorphism Networks trained on various graph property prediction datasets to random output by flipping only a small fraction of the network’s bits, demonstrating their higher destructive power compared to a bit flip attack transferred from convolutional neural networks. Our attack is transparent and motivated by theoretical insights, which are confirmed by extensive empirical results.
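
As a rough illustration of why a single bit flip in a quantized weight can be so damaging, the sketch below flips one bit of a signed 8-bit integer. It is not the attack from the paper: the int8 two's-complement storage, the helper name `flip_bit`, and the quantization scale are all assumptions made for this example, and the sketch does not show how the attack selects which bits to flip.

```python
def flip_bit(weight: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit integer (two's complement assumed)."""
    if not -128 <= weight <= 127:
        raise ValueError("weight must fit in int8")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be in 0..7")
    raw = weight & 0xFF   # reinterpret the signed value as an unsigned byte
    raw ^= 1 << bit       # toggle the chosen bit
    return raw - 256 if raw >= 128 else raw  # map back to the signed range


scale = 0.02              # hypothetical quantization scale, not from the paper
w_q = 3                   # hypothetical quantized weight (0b00000011)

low = flip_bit(w_q, 0)    # least significant bit: 3 -> 2
high = flip_bit(w_q, 7)   # most significant (sign) bit: 3 -> -125

print(f"original:  {w_q:5d} -> {w_q * scale:+.2f}")
print(f"flip bit 0: {low:5d} -> {low * scale:+.2f}")
print(f"flip bit 7: {high:5d} -> {high * scale:+.2f}")
```

Under these assumptions, flipping the most significant bit changes both the sign and the magnitude of the dequantized weight, whereas flipping the least significant bit barely moves it. This is one reason why the choice of bit matters far more than the number of flips.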

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 5313
