Abstract: In recent years, the need to rapidly develop vaccines and therapeutic proteins in response to viral outbreaks has highlighted the importance of innovation in protein design. This study explores the application of diffusion models to de novo protein sequence synthesis. We present a method in which an autoencoder first reduces the dimensionality of the input, and the diffusion model is then trained on the resulting latent vectors. Hyperparameter optimisation through grid search further improves model performance. Evaluation metrics are accuracy for the autoencoders and Fréchet distance, density, and coverage for the diffusion models. Results indicate that the proposed method outperforms existing state-of-the-art methods such as ProteinGAN. These findings highlight the efficacy of diffusion models for generating de novo amino acid sequences, offering promising avenues for protein engineering and drug development.
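To make the idea of training a diffusion model on latent vectors concrete, the following is a minimal sketch of the forward (noising) step applied to a single latent vector. It assumes a standard DDPM-style formulation with a linear noise schedule; the abstract does not state which schedule or parameterisation was used, so the function names, the schedule endpoints, the number of steps, and the four-dimensional example vector are all hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return linearly spaced noise variances (assumed values, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) up to each step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_latent(latent, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps.

    Returns the noised vector and the Gaussian noise eps that was added;
    in a DDPM-style setup a denoising network is trained to predict eps.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    noised = [signal_scale * x + noise_scale * e for x, e in zip(latent, noise)]
    return noised, noise


rng = random.Random(0)                    # fixed seed for repeatability
betas = linear_beta_schedule(num_steps=1000)
alpha_bars = cumulative_alphas(betas)
latent = [0.5, -1.2, 0.3, 2.0]            # stand-in for an autoencoder latent vector
noised, noise = noise_latent(latent, alpha_bars[499], rng)
```

Because the autoencoder compresses each sequence to a short vector, the noising and denoising steps operate on far fewer dimensions than they would on one-hot encoded amino acid sequences, which is the motivation for the two-stage design described above.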