ESM-Structure: Code for Stage-0 pre-training and fine-tuning.
SEPIT: Code for Stage-1 and Stage-2 pre-training.

We removed all config files because they contained a large amount of non-anonymous information.

Because the dataset file is too large (~10 GB), it is difficult to release it anonymously. We will open-source it once the paper is accepted. Specific dataset examples can be found in the appendix.