Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction

Published: 11 Apr 2022, Last Modified: 05 May 2023, RC2021, Readers: Everyone
Keywords: Inter-novel-protein interaction, prediction, graph neural network, correlation, reproduction
TL;DR: Reproducing the paper titled Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction
Abstract: \section*{\centering Reproducibility Summary}
\textit{This is a reproducibility report on the paper \cite{lv2021learning}, submitted to the \href{https://paperswithcode.com/rc2021}{ML Reproducibility Challenge 2021}.}

\subsection*{Scope of Reproducibility}
In the paper, the authors propose a new evaluation that respects inter-novel-protein interactions, along with a new method that significantly outperforms previous PPI methods, especially under this new evaluation. We therefore first examine whether this kind of evaluation is objectively better, and then try to reproduce the results of the proposed model in comparison with the previous state of the art, PIPR.

\subsection*{Methodology}
For the reproduction we used the authors' \href{https://github.com/lvguofeng/GNN_PPI}{code}, slightly changing the pipeline for automation. We also used the \href{https://github.com/muhaochen/seq_ppi}{PIPR code}, where we completely changed the pipeline so that it could be run on the same datasets as GNN-PPI, but kept their function for building the model. The experiments were run on an Nvidia Titan X GPU, using around 250 GPU hours altogether.

\subsection*{Results}
We reproduced the paper's results within the standard deviations of our repeated experiments. In some cases, however, this still leaves a large difference between the performances, which comes from the different train-test splits produced by the newly proposed splitting schemes. Even with these discrepancies, we managed to confirm (at least partially) all of the authors' claims: the proposed model, GNN-PPI, performed better than PIPR both overall and for inter-novel-protein interactions; evaluation under their proposed schemes predicted generalization performance better; and their model is robust when predicting for newly discovered proteins. Here our results were surprising: they were even better when the network was built knowing fewer proteins.
\subsection*{What was easy}
It was easy to run the GNN-PPI code on different datasets and with different parameters, as the repository is nicely organized and the code is clearly structured. It was also easy to understand the authors' framing of the problem, the reasons for the new evaluation, and the framework of the proposed model.

\subsection*{What was difficult}
For both models used in this reproduction, the environment setup was harder than expected. There was no documentation and there were no comments in the code, which made it hard to understand at first. Some debugging was needed for GNN-PPI, and many code changes were needed for PIPR to train well.
Paper Url: https://arxiv.org/pdf/2105.06709.pdf
Paper Venue: IJCAI 2021