## Recommended Environment
Docker image: `nvcr.io/nvidia/pytorch:22.02-py3`
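
As a rough sketch of how the container might be started, the helper below wraps `docker run`. It assumes the image is pulled from the NGC registry (`nvcr.io`), that the NVIDIA Container Toolkit is installed so `--gpus all` works, and that the repository is the current directory. The mount point `/workspace/moe` and the function name `launch_container` are hypothetical choices, not something this repository prescribes.

```shell
IMAGE="nvcr.io/nvidia/pytorch:22.02-py3"
DATA_ROOT="/mnt/azstorage"   # host directory that holds the processed data

# Hypothetical helper: start the container with all GPUs visible,
# the data root mounted at the same path, and the current directory
# mounted at /workspace/moe (an assumed location).
launch_container() {
  docker run --gpus all --ipc=host -it --rm \
    -v "${DATA_ROOT}:${DATA_ROOT}" \
    -v "$(pwd):/workspace/moe" -w /workspace/moe \
    "${IMAGE}" "$@"
}
```

Calling `launch_container bash` would open an interactive shell inside the container. `--ipc=host` is included because PyTorch data loaders commonly need more shared memory than Docker's default.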

## Data

Preprocess the data following the previous work, then place the processed data in `/mnt/azstorage/wmt14_en_de_joined_dict/`.
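
Before launching a long training run, it can help to confirm that the data directory is actually in place. The check below is a minimal sketch: the function name `check_data_dir` is hypothetical, and it only verifies that the directory exists and is non-empty, since the expected file names are not listed here.

```shell
DATA_DIR="/mnt/azstorage/wmt14_en_de_joined_dict/"

# Hypothetical helper: succeed only if the directory exists
# and contains at least one entry.
check_data_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing data directory: $dir" >&2
    return 1
  fi
  if [ -z "$(ls -A "$dir")" ]; then
    echo "data directory is empty: $dir" >&2
    return 1
  fi
  return 0
}
```

For example, `check_data_dir "$DATA_DIR" || exit 1` at the top of a launch script would stop early if the data is missing.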

## Training

To train an MoE model with 4 experts on a DGX-2 (16x 32GB V100), run:
`bash run_moe.sh 4`

To evaluate the resulting model, run:
`bash eval_moe.sh output-4 0 /mnt/azstorage/wmt14_en_de_joined_dict/`
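
The two commands above can be chained so that evaluation only runs if training succeeds. This is a sketch under assumptions: the wrapper name `train_and_eval` is hypothetical, and it assumes `run_moe.sh` writes its checkpoint to `output-<number of experts>`, which is inferred from the 4-expert example rather than confirmed for other expert counts.

```shell
DATA_DIR="/mnt/azstorage/wmt14_en_de_joined_dict/"

# Hypothetical wrapper: train with the given number of experts,
# then evaluate the resulting model.
train_and_eval() {
  num_experts="$1"
  # Assumption: the output directory is named output-<num_experts>,
  # matching the 4-expert example.
  out_dir="output-${num_experts}"
  # Stop before evaluation if training fails.
  bash run_moe.sh "${num_experts}" || return 1
  # The second argument (0) is copied verbatim from the evaluation command.
  bash eval_moe.sh "${out_dir}" 0 "${DATA_DIR}"
}
```

With this in place, `train_and_eval 4` would reproduce the two steps shown above.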
