Transmembrane Protein Inter-Helical Residue Contacts Prediction Using Transductive Support Vector Machines
Abstract: Protein functions are strongly related to their 3D structure. Therefore, it is crucial to
identify their structure to understand how they function. Studies have shown that
a large number of proteins, called transmembrane (TM) proteins, cross a biological
membrane, and many of them adopt an alpha-helical shape. How these helices contact one
another inside the membrane plays a major role in their tilt angle and relative position
and hence the overall structure of the protein. To address the scarcity of labeled data,
which is common in amino acid residue contact prediction, we adopt a
transductive learning approach, which uses the unlabeled test data during training in
order to obtain a better model. Using features extracted from protein structures, we
compare a transductive support vector machine (TSVM) with an inductive SVM in predicting
helix-helix residue contacts, in order to identify the conditions under which TSVMs gain
performance and where they are limited. We also investigate the performance degradation
of the TSVM and the best remedial solutions in the literature. In particular, we develop
an early-stop technique that generates a more accurate model and outperforms the
state-of-the-art TSVM by 5%, as tested on a benchmark set of transmembrane proteins.