High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text Attributed Graphs

Peiyan Zhang; Chaozhuo Li; Liying Kang; Feiran Huang; Senzhang Wang; Xing Xie; Sunghun Kim

High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text Attributed Graphs

Peiyan Zhang, Chaozhuo Li, Liying Kang, Feiran Huang, Senzhang Wang, Xing Xie, Sunghun Kim

Published: 23 Jan 2024, Last Modified: 23 May 2024TheWebConf24EveryoneRevisionsBibTeX

Keywords: Graph Neural Networks, Transformer, Contrastive Learning

Abstract: We investigate node representation learning on text-attributed graphs (TAGs), where nodes are associated with text information. Although recent studies on graph neural networks (GNNs) and pretrained language models (PLMs) have exhibited their power in encoding network and text signals, respectively, less attention has been paid to delicately coupling these two types of models on TAGs. Specifically, existing GNNs rarely model text in each node in a contextualized way; existing PLMs can hardly be applied to characterize graph structures due to their sequence architecture. To address these challenges, we propose HASH-CODE, a High-frequency Aware Spectral Hierarchical Contrastive Selective C\textbf{d}ing method that integrates GNNs and PLMs into a unified model. Different from previous “cascaded architectures” that directly add GNN layers upon a PLM, our HASH-CODE relies on five self-supervised optimization objectives to facilitate thorough mutual enhancement between network and text signals in diverse granularities. Moreover, we show that existing contrastive objective learns the low-frequency component of the augmentation graph and propose a high-frequency component (HFC)-aware contrastive learning objective that makes the learned embeddings more distinctive. Extensive experiments on six real-world benchmarks substantiate the efficacy of our proposed approach. In addition, theoretical analysis and item embedding visualization provide insights into our model interoperability.

Track: Web Mining and Content Analysis

Submission Guidelines Scope: Yes

Submission Guidelines Blind: Yes

Submission Guidelines Format: Yes

Submission Guidelines Limit: Yes

Submission Guidelines Authorship: Yes

Student Author: Yes

Submission Number: 1797

Loading