Data for the pretraining stage can be downloaded with:
wget https://bioos-hermite-beijing.tos-cn-beijing.volces.com/unimol_data/pretrain/ligands.tar.gz
After downloading, extract the archive to data/ligands
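The extraction step can be sketched as below. The helper name extract_ligands is hypothetical, and the sketch assumes the archive was downloaded to the current directory. Whether the archive already contains a top-level directory is not stated here, so inspect data/ligands afterwards and adjust the destination if needed.

```shell
# Hypothetical helper (not part of the repository): extract a .tar.gz
# archive into a destination directory, creating the directory first.
extract_ligands() {
    archive="${1:-ligands.tar.gz}"   # assumed download location
    dest="${2:-data/ligands}"        # target directory named in this README
    mkdir -p "$dest"
    # -x extract, -z gunzip, -f archive file, -C change to dest before extracting
    tar -xzf "$archive" -C "$dest"
}
```

Usage: extract_ligands ligands.tar.gz data/ligands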
Set up the environment by pulling the prebuilt Docker image:
docker pull dptechnology/unimol:latest-pytorch1.11.0-cuda11.3
Then start the container interactively:
docker run --gpus all -it --shm-size=2g -v "$(pwd):/workspace" dptechnology/unimol:latest-pytorch1.11.0-cuda11.3
This command mounts the current directory at /workspace inside the container and starts an interactive shell there.
Once inside the container, change to the mounted directory:
cd /workspace
Now we can train the model with:
bash scripts/train_molecule.sh
Checkpoints will be saved to ./save/pretrain_ligand_myloss/
The default batch_size is 64; it can be changed in scripts/train_molecule.sh
The default weight distribution is gamma.
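To change the batch size without opening an editor, something like the following may work. It is a sketch that assumes scripts/train_molecule.sh sets the value on a line beginning with batch_size= (the exact variable layout of the script is not shown here), and the helper name set_batch_size is hypothetical.

```shell
# Hypothetical helper: rewrite the batch_size=... line of a training script.
# Assumes the script contains a line that starts with "batch_size=".
set_batch_size() {
    script="$1"
    new_size="$2"
    # Stop early if the assumed line is missing, rather than silently doing nothing.
    if ! grep -q '^batch_size=' "$script"; then
        echo "no line starting with batch_size= in $script" >&2
        return 1
    fi
    # Write to a temporary file, then copy back so file permissions are kept.
    sed "s/^batch_size=.*/batch_size=${new_size}/" "$script" > "$script.tmp" &&
        cat "$script.tmp" > "$script" &&
        rm -f "$script.tmp"
}
```

Usage: set_batch_size scripts/train_molecule.sh 32
If the helper reports that no such line exists, edit the script by hand instead.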