Capture Long-Range Dependency with Meta-Path Transformer for De-Anonymization of Q&A Sites

Baojie Tian, Liangjun Zang, Jizhong Han, Songlin Hu

Published: 01 Jan 2024, Last Modified: 08 Aug 2024, CSCWD 2024, CC BY-SA 4.0

Abstract: The rapid growth of social question-and-answer (Q&A) platforms has made anonymous knowledge sharing a valuable yet challenging practice. Although these platforms implement various anonymity techniques, the threat of privacy breaches remains a major concern. To study this issue, we introduce the task of de-anonymization within Q&A communities and provide a bilingual dataset (Chinese and English) for research. We propose a novel de-anonymization framework called MPT, which improves the model's ability to capture long-range dependencies between nodes by integrating graph neural networks (GNNs) and language models (LMs). Specifically, we use a GNN to extract structural features, and then encode and fuse node representations from multiple meta-paths using a Transformer and attention mechanisms. Extensive experiments on the Zhihu and Quora datasets show that our model significantly outperforms the baseline model. In addition, our model offers a degree of interpretability, which helps explain the factors that contribute to user privacy breaches and supports the implementation of appropriate safeguards. The dataset and code used in this study have been made publicly available.
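
The fusion step mentioned in the abstract (combining node representations obtained from several meta-paths through attention) can be illustrated with a minimal sketch. This is not the MPT implementation: the function names `softmax` and `fuse_meta_paths`, the scaled dot-product scoring, the `query` vector, and the toy embeddings are all assumptions chosen for illustration, and a real model would learn these values during training.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_meta_paths(path_embeddings, query):
    """Fuse one node's per-meta-path embeddings with attention.

    path_embeddings: list of vectors, one per meta-path (hypothetical input).
    query: a vector of the same dimension (hypothetical learned parameter).
    Returns the fused vector and the attention weight of each meta-path.
    """
    dim = len(query)
    # Scaled dot-product score between each meta-path embedding and the query.
    scores = [
        sum(p_i * q_i for p_i, q_i in zip(p, query)) / math.sqrt(dim)
        for p in path_embeddings
    ]
    weights = softmax(scores)
    # Weighted sum of the meta-path embeddings, one dimension at a time.
    fused = [
        sum(w * p[i] for w, p in zip(weights, path_embeddings))
        for i in range(dim)
    ]
    return fused, weights

# Toy example: three meta-path views of the same node (made-up values).
path_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
fused, weights = fuse_meta_paths(path_embeddings, query)
print(weights)  # one weight per meta-path, summing to 1
```

In a sketch like this, the attention weights indicate how much each meta-path contributes to the fused representation, which is one plausible way a model of this kind could expose the interpretability the abstract refers to.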