DeepPurpose++: A Benchmark Study on Learning Based Protein Discovery

Sunstella 2023 Summer Research Camp Submission9 Authors

15 Jun 2023 (modified: 22 Jun 2023)
Keywords: Protein Sequence Understanding, Protein Learning Benchmarks and Datasets
TL;DR: A Python library that benchmarks protein learning tasks
Abstract: Convolutional networks and transformers have proven highly effective in protein learning tasks. However, several state-of-the-art graph neural network (GNN) architectures with pre-training frameworks, as well as reinforcement learning methods, are not adequately represented in the latest protein benchmark. Libraries such as DeepPurpose have incorporated these structure-based or agent-based approaches, but their coverage of protein tasks remains incomplete and their interfaces are not as user-friendly as they could be. To address these problems, we introduce DeepPurpose++, an extension of DeepPurpose designed to integrate more powerful models for protein learning. We plan to conduct 12 benchmark tests across 4 benchmark groups and to compare transformer pre-training with GNN-based pre-training. We also plan to optimize both the code and the interface to improve performance and user experience.
Submission Number: 9