HOTGpred: Enhancing human O-linked threonine glycosylation prediction using integrated pretrained protein language model-based features and multi-stage feature selection approach
Abstract: Highlights
• HOTGpred is the first tool to utilize PLM-based embeddings and a multi-stage feature selection process to identify OTGs.
• Twenty-five feature sets and nine classifiers were evaluated to identify the most discriminative features for OTG prediction.
• HOTGpred significantly outperformed all existing tools in both balanced and imbalanced tests.
• SHAP and ablation analyses identified the features that contributed most significantly to HOTGpred's performance.
• We have deployed a web server, freely available at https://balalab-skku.org/HOTGpred/, for identifying OTGs.