MC-MSTLoc: Self-Supervised Pre-Training for Imbalanced Multi-Label Protein Subcellular Localization Prediction Using Immunofluorescence Images
Abstract: With the rapid growth of high-resolution microscopy imaging data, current protein subcellular localization methods often struggle with imbalanced, long-tailed distributions in large-scale protein data. To address this challenge, this paper proposes a self-supervised pre-training method called MC-MSTLoc. To maximize feature consistency and inconsistency in microscopy imaging data, the pre-training scheme is built on contrastive tasks at the scale and view levels, which substantially improves the quality of the learned feature representations. Experimental results on benchmark datasets demonstrate that MC-MSTLoc outperforms existing self-supervised pre-training methods for protein subcellular localization prediction. Model ablation experiments and an analysis of pre-training effectiveness confirm the method's performance. Additionally, model visualization analysis and interpretability experiments demonstrate the crucial role of the method in learning the information distribution and patterns of different subcellular locations.
External IDs: dblp:journals/tcbb/WangQGW25