Abstract: Highlights•This method explores transfer learning of large DTI datasets to predict specific protein-drug interactions by leveraging the model’s ability to learn interaction features from the sequences.•Unlike previous methods that encode drugs and proteins using individual amino acids or atoms, we utilize a subsequences vocabulary to embed sequences, preserving functional units.•The BERT-based model combines protein and drug sequences during encoding and calculates attention scores between their subsequences, facilitating the exploration of the interaction module.
Loading