Interpretable Clustering on Dynamic Graphs with Recurrent Graph Neural Networks

Anonymous

05 Feb 2022 (modified: 05 May 2023) · ML Reproducibility Challenge 2021 Fall Blind Submission · Readers: Everyone
Keywords: Deep Learning, RNN, GCN, Dynamic Clustering
Abstract: Scope of Reproducibility: The main goal of the original paper is to perform dynamic node clustering in temporal graphs. The primary objective of this reproducibility study is to verify the paper's major claim that its proposed hybrid structures of recurrent neural networks (RNNs) and graph convolutional networks (GCNs) outperform state-of-the-art graph clustering approaches. Another major claim of the paper is that, under certain assumptions, almost exact recovery of node-cluster memberships is achievable.

Methodology: The models proposed in the original study are hybrid RNN-GCN models: the RNN learns an approximation of the decay rate from the temporal graph structure, and the GCN estimates the probability of each node belonging to each cluster (a minimal sketch of this architecture is given after the abstract). To validate the claims, we reimplemented the models in the TensorFlow deep learning framework; the authors' original PyTorch code is available online. The simulations for this reproducibility study were carried out on a Dell Alienware m15 R3 machine with an Intel Core i7-10750H CPU @ 2.6 GHz, 16 GB of RAM, and Windows 10 Home. The machine also has an NVIDIA GeForce GTX 1660 Ti GPU with 6 GB of memory.

Results: The simulation results are inconclusive. Since the exact training and test data used by the authors of the original paper are not retrievable, our results do not always support the claims of the original paper. In our reproducibility study, the baseline methods sometimes outperform the proposed models, although the performance gap is small ($\leq 1\%$) in the majority of cases.

What was easy: The paper is very well written. The proposed models come with sufficient algorithmic explanation to implement the code without difficulty.

What was difficult: The original paper does not explain its choice of hyperparameters. Neither the number of simulation runs nor any confidence intervals on the performance metrics are explicitly specified in the paper. The actual training and test data points used to report the results of the original paper are not retrievable. For these reasons, it is difficult to validate the claims and interpret the overall meaning of the simulation results.

Communication with original authors: We suspected that the original paper mistakenly reported the area under the curve (AUC) metric in place of the F1-score and vice versa. We therefore reached out to the authors with our queries. The authors promptly responded and confirmed that we were right. They also provided the number of simulation runs used to report the results, which had not previously been mentioned in the paper.
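The following is a minimal sketch of the hybrid RNN-GCN idea described in the Methodology paragraph, written in TensorFlow to match our reimplementation. It assumes a single learned decay rate that exponentially smooths the adjacency matrix across snapshots before a two-layer GCN produces per-node cluster memberships; the class name, layer sizes, and smoothing rule are illustrative assumptions, not the authors' exact model.

```python
# Minimal sketch, not the authors' code. Assumptions: one trainable decay
# rate in (0, 1) smooths dense adjacency snapshots over time; a two-layer
# GCN on the smoothed graph outputs softmax cluster memberships.
import tensorflow as tf

class DecayGCN(tf.keras.Model):
    def __init__(self, num_clusters, hidden_dim=32):
        super().__init__()
        # Unconstrained parameter, mapped through a sigmoid so the
        # decay rate stays in (0, 1).
        self.decay_logit = tf.Variable(0.0, trainable=True)
        self.w1 = tf.keras.layers.Dense(hidden_dim, activation="relu")
        self.w2 = tf.keras.layers.Dense(num_clusters)

    def call(self, adj_snapshots, features):
        # adj_snapshots: list of [N, N] dense adjacency matrices, one per
        # time step; features: [N, F] node feature matrix.
        lam = tf.sigmoid(self.decay_logit)
        smoothed = adj_snapshots[0]
        for adj in adj_snapshots[1:]:
            # Exponentially decay old edges, mix in the new snapshot.
            smoothed = lam * smoothed + (1.0 - lam) * adj
        # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}.
        a_hat = smoothed + tf.eye(tf.shape(smoothed)[0])
        deg = tf.reduce_sum(a_hat, axis=1)
        d_inv_sqrt = tf.linalg.diag(tf.math.rsqrt(deg))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt
        # Two GCN layers; softmax gives per-node cluster memberships.
        h = self.w1(a_norm @ features)
        logits = self.w2(a_norm @ h)
        return tf.nn.softmax(logits, axis=-1)
```

Training such a model would typically minimize a cross-entropy loss between the predicted memberships and the known cluster labels at the final snapshot, updating the decay rate jointly with the GCN weights.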
Paper Url: https://ojs.aaai.org/index.php/AAAI/article/view/16590
Paper Venue: AAAI 2021
Supplementary Material: zip