GENIE: Watermarking Graph Neural Networks for Link Prediction

Published: 08 Apr 2026, Last Modified: 08 Apr 2026. Accepted by TMLR. CC BY 4.0
Abstract: The rapid adoption, usefulness, and resource-intensive training of Graph Neural Network (GNN) models have made them invaluable intellectual property in graph-based machine learning. However, their widespread adoption also makes them susceptible to stealing, necessitating robust Ownership Demonstration (OD) techniques. Watermarking is a promising OD framework for deep neural networks, but existing methods fail to generalize to GNNs due to the non-Euclidean nature of graph data. Existing works on GNN watermarking primarily focus on node and graph classification, overlooking Link Prediction (LP). In this paper, we propose GENIE (watermarking Graph nEural Networks for lInk prEdiction), the first scheme to watermark GNNs for LP. GENIE creates a novel backdoor for both node-representation and subgraph-based LP methods, utilizing a unique trigger set and a secret watermark vector. Our OD scheme is equipped with Dynamic Watermark Thresholding (DWT), ensuring high verification probability while addressing practical issues in existing OD schemes. We extensively evaluate GENIE across 4 diverse model architectures (i.e., SEAL, GCN, GraphSAGE, and NeoGNN), 7 real-world datasets, and 21 watermark removal techniques, and demonstrate its robustness to watermark removal and ownership piracy attacks. Finally, we discuss adaptive attacks against GENIE and a defense strategy to counter them.
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
1. **Methodology Clarifications:** Clarified the definitions of the loss functions and optimizers utilized in Section 4.2.
2. **Statistical Rigor:** Explicitly stated the statistical independence assumptions underlying the Dynamic Watermark Thresholding (DWT) procedure in Section 4.3.2.
3. **Pretrained Model Baselines:** Added the clean test AUC ($C_{test}$) to Table 5 to provide a direct baseline comparison, as requested by the reviewer.
4. **Threat Model Distinctions:** Detailed the fundamental differences between the threat models for inductive and transductive link prediction in Appendix G.1.
5. **Distribution Analysis:** Expanded the discussion in Appendix G.2 to further explain the effect of the watermark vector distribution.
6. **Baseline Performance:** Included Table 40 in Appendix I to systematically analyze the baseline performance of the non-watermarked models.
7. **Expressive Architectures:** Extended the evaluation of GENIE in Appendix J to demonstrate its robustness on more expressive architectures, specifically GAT and GIN.
8. **Feature Heterophily:** Analyzed the impact of feature heterophily on the GENIE framework by evaluating it across 7 additional datasets in Appendix K.
9. **Notation Directory:** Added Appendix L to provide a comprehensive summary table of all mathematical notations, cross-referencing where each is first introduced in the text.
10. **Algorithmic Outlines:** Provided the complete, step-by-step algorithms for both node-representation and subgraph-based link prediction methods using GENIE in Appendix M.
11. **Conclusion Expansion:** Expanded the Conclusion (Section 7) to explicitly summarize the limitations of the current framework and outline potential directions for future work.
Video: https://drive.google.com/drive/folders/1zJxvt3utFtjUzy7m_7IlDrfnLVXvMlrh?usp=sharing
Code: https://github.com/CiaoAnkit/GENIE-Watermarking-Graph-Neural-Networks-for-Link-Prediction
Assigned Action Editor: ~Alessandro_De_Palma1
Submission Number: 6549