Persistent Tor-algebra based stacking ensemble learning (PTA-SEL) for protein-protein binding affinity prediction
Keywords: persistent Tor-algebra, molecular featurization, protein-protein interactions, stacking ensemble learning
TL;DR: We propose, for the first time, persistent Tor-algebra (PTA), a PTA-based molecular representation and featurization, and PTA-based stacking ensemble learning for protein-protein binding affinity prediction.
Abstract: Protein-protein interactions (PPIs) play crucial roles in almost all biological processes. Recently, data-driven machine learning models have shown great power in the analysis of PPIs. However, efficient molecular representation and featurization remain key issues that limit the performance of learning models. Here, we propose, for the first time, persistent Tor-algebra (PTA), PTA-based molecular characterization and featurization, and PTA-based stacking ensemble learning (PTA-SEL) for PPI binding affinity prediction. More specifically, the Vietoris-Rips complex is used to characterize the PPI structure, and its persistent Tor-algebra is computed to form the molecular descriptors. These descriptors are then fed into our stacking model to make the prediction. We systematically test our model on the two most commonly used datasets, SKEMPI and AB-Bind. To the best of our knowledge, our model outperforms all existing models on both datasets.
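
As an illustration of the first step described in the abstract, the sketch below builds a Vietoris-Rips complex over a small point cloud at several filtration radii. It is a minimal example under stated assumptions, not the method of this work. The coordinates, the radii, and the function names `vietoris_rips` and `simplex_counts` are hypothetical. The per-dimension simplex counts are only a stand-in descriptor. The persistent Tor-algebra computation itself is not reproduced here.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, radius, max_dim=2):
    """Return the simplices (index tuples) of a Vietoris-Rips complex.

    A set of points spans a simplex when every pairwise distance
    is at most `radius`. Brute force, so suitable for small inputs only.
    """
    n = len(points)
    # Pairs of points close enough to be joined by an edge.
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= radius
    }
    simplices = [(i,) for i in range(n)]  # 0-simplices (vertices)
    for k in range(2, max_dim + 2):  # k = number of vertices in the simplex
        for verts in combinations(range(n), k):
            if all(pair in close for pair in combinations(verts, 2)):
                simplices.append(verts)
    return simplices


def simplex_counts(points, radii, max_dim=2):
    """Toy descriptor: simplex counts per dimension at each radius.

    This is a placeholder for a real topological descriptor and is
    NOT the persistent Tor-algebra featurization.
    """
    features = []
    for r in radii:
        counts = [0] * (max_dim + 1)
        for simplex in vietoris_rips(points, r, max_dim):
            counts[len(simplex) - 1] += 1
        features.extend(counts)
    return features


# Hypothetical coordinates: four points on a unit square.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
descriptor = simplex_counts(atoms, radii=[0.5, 1.0, 1.5])
print(descriptor)  # [4, 0, 0, 4, 4, 0, 4, 6, 4]
```

As the radius grows, the square first gains its four sides and then its diagonals and triangles. A fixed-length vector tracking such changes across the filtration is the kind of input a downstream stacking model could consume.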