Abstract: This study uses computational methods to identify vesicular transport proteins accurately. Identifying these proteins improves our understanding of the structure of their protein family and can thereby support the design of more effective drug targets for patients with endocrine disorders. Biology researchers have recently turned increasingly to deep learning to address this problem. To further improve classification performance, we investigated the following models, each built on distinct features: (1) we devised a novel protein feature, AAC_PSSM, by combining amino acid composition (AAC) and position-specific scoring matrix (PSSM) features, and trained a gated recurrent unit (GRU) model on it; (2) we constructed an ensemble model by combining the existing GRU model with a neural network trained on the AAC feature; (3) we applied a random forest to the pseudo-amino acid composition (PseAAC) feature; and (4) we explored a natural language processing (NLP) approach that treats the protein sequence as natural language and applies various neural network architectures. Among these models, the ensemble incorporating the PSSM and AAC features achieved the highest sensitivity (81.03%) and accuracy (82.43%). The proposed model also outperformed state-of-the-art models evaluated on the same problem and datasets.
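
As an illustration of the simplest feature named above, the following minimal sketch computes an amino acid composition (AAC) vector, that is, the fraction of each of the 20 standard amino acids in a sequence. It is not the authors' implementation: the function name `amino_acid_composition`, the alphabetical residue ordering, and the choice to ignore non-standard residue codes are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
# The ordering is an assumption; any fixed ordering works if used consistently.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return a 20-dimensional AAC vector for a protein sequence.

    Each entry is the count of one standard amino acid divided by the
    number of standard residues in the sequence. Non-standard codes
    (for example 'X') are ignored, which is an assumption of this sketch.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    vector = amino_acid_composition("MKTAYIAKQR")
    print(len(vector), round(sum(vector), 6))
```

A fixed-length vector of this kind can be passed to a feed-forward network or a random forest regardless of sequence length, which is one common reason composition features are paired with sequence models in an ensemble.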