Abstract: Language Models (LMs) and Graph Neural Networks (GNNs) have shown great promise in their respective areas, yet integrating structured graph data with rich textual information remains challenging. In this work, we propose \emph{Graph Masked Language Models} (GMLM), a novel dual-branch architecture that combines the structural learning of GNNs with the contextual power of pretrained language models. Our approach introduces two key innovations: (i) a \emph{semantic masking strategy} that utilizes graph topology to selectively mask nodes based on their structural importance, and (ii) a \emph{soft masking mechanism} that interpolates between original node features and a learnable mask token, ensuring smoother information flow during training. Extensive experiments on multiple node classification and language understanding benchmarks demonstrate that GMLM not only achieves state-of-the-art performance but also exhibits enhanced robustness and stability. This work underscores the benefits of integrating structured and unstructured data representations for improved graph learning.
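To make the second contribution concrete, below is a minimal sketch of what a soft masking mechanism of this kind could look like: node features are blended with a learnable mask token according to a structural-importance score rather than being hard-replaced. The class name `SoftMask`, the scalar `alpha`, and the use of importance scores as interpolation weights are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class SoftMask(nn.Module):
    """Illustrative soft-masking layer (assumption, not the paper's exact design):
    interpolates between original node features and a learnable mask token."""
    def __init__(self, dim: int):
        super().__init__()
        self.mask_token = nn.Parameter(torch.zeros(dim))  # learnable [MASK] embedding

    def forward(self, x: torch.Tensor, importance: torch.Tensor, alpha: float = 0.8) -> torch.Tensor:
        # x:          (num_nodes, dim) original node features
        # importance: (num_nodes,) structural-importance scores in [0, 1]
        #             (e.g. normalized degree or PageRank), used here as soft mask weights
        w = alpha * importance.unsqueeze(-1)          # per-node interpolation weight
        return (1.0 - w) * x + w * self.mask_token    # smooth blend instead of hard replacement
```

Under these assumptions, a call such as `SoftMask(dim=768)(x, pagerank_scores)` would mask structurally important nodes more strongly while keeping gradients flowing through the original features.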
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Vicenç_Gómez1
Submission Number: 4535