Abstract: This paper describes a targeted drug design method based on a multiscale encoder-decoder framework. Encoders encode target gene and protein features, and a decoder designs drugs from those target features. The method fuses target gene and protein information for targeted drug design and employs effective feature extraction strategies. A multilevel gene feature extraction (MGFE) scheme is proposed to extract multilevel target features by extracting base and codon features in gene expression. The MGFE-based gene encoder extracts features from nucleotide sequences in a way that simulates gene transcription and translation. In addition, a multi-embedding protein feature extraction (MPFE) scheme is proposed to extract target protein features from amino acid sequences. The MPFE-based protein encoder includes three embedding layers, which provide a unique linear layer for each amino acid. In keeping with the structural characteristics of proteins, amino acids of the same type at different positions are mapped to the same embedding vector, without positional encoding. Finally, a gated recurrent unit (GRU) based drug decoder decodes the gene and protein features and generates new targeted drugs. The experiments demonstrate that the proposed method outperforms previous methods in terms of validity, novelty, and binding affinity.
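
To make the idea of base-level and codon-level views of a gene sequence concrete, the following is a minimal sketch, not the paper's MGFE implementation. It only shows how one nucleotide sequence could be tokenized at two levels before any embedding or encoding. The function names, the `ACGT` vocabulary order, the base-4 codon numbering, and the choice to drop an incomplete trailing codon are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical vocabulary order; the paper does not specify one.
BASES = "ACGT"
BASE_INDEX = {base: i for i, base in enumerate(BASES)}


def base_tokens(seq: str) -> List[int]:
    """Base-level view: map each nucleotide to an integer id in [0, 4)."""
    return [BASE_INDEX[base] for base in seq.upper()]


def codon_tokens(seq: str) -> List[int]:
    """Codon-level view: map each non-overlapping triplet to an id in [0, 64).

    An incomplete trailing codon is dropped (an assumption of this sketch).
    """
    ids = base_tokens(seq)
    usable = len(ids) - len(ids) % 3
    return [
        ids[i] * 16 + ids[i + 1] * 4 + ids[i + 2]  # base-4 number of the triplet
        for i in range(0, usable, 3)
    ]


def multilevel_tokens(seq: str) -> Tuple[List[int], List[int]]:
    """Return both views of the same sequence, ready for separate embeddings."""
    return base_tokens(seq), codon_tokens(seq)


if __name__ == "__main__":
    bases, codons = multilevel_tokens("ATGGCC")
    print(bases)   # one id per nucleotide
    print(codons)  # one id per codon
```

In a full model, each token stream would typically feed its own embedding layer, and the two resulting feature sequences would then be combined by the gene encoder. How the paper combines them is not stated in the abstract.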