### Path

```
dir:
--Text2Music_data
  --v3.1_20230318
    --data-bin
      train.bin
      valid.bin
      test.bin
    train_command.npy
    valid_command.npy
    test_command.npy
--Text2Music_main
  --MidiDataExtractor
  --data_process
  --Attri2Music_mask_v5
  --sh
```
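Before training, it can help to confirm that the data directory matches the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_files` is hypothetical, and it assumes the script is run from the directory that contains `Text2Music_data`.

```python
from pathlib import Path

# Files expected under the data root, copied from the layout above.
EXPECTED_FILES = [
    "v3.1_20230318/data-bin/train.bin",
    "v3.1_20230318/data-bin/valid.bin",
    "v3.1_20230318/data-bin/test.bin",
    "v3.1_20230318/train_command.npy",
    "v3.1_20230318/valid_command.npy",
    "v3.1_20230318/test_command.npy",
]


def missing_files(data_root):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the current directory contains Text2Music_data.
    for rel in missing_files("Text2Music_data"):
        print(f"missing: {rel}")
```

An empty result from `missing_files` means every file listed in the layout was found.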

### Env

```shell
bash setup.sh
```

Then prepare `MidiDataExtractor` under `Text2Music_main`, as shown in the directory layout above.

### Data process

Convert the preprocessed `RID.bin` and `Token.bin` into `numpy` array format. For more details, refer to `split_data.py`.

```shell
cd data_process
python split_data.py
```

### Training

The following command trains the `xl` model on 16 nodes with 4 GPUs per node (each GPU with >= 16 GB of memory).

```shell
# bash sh/train-mask_v5-xl-datav3-final.sh node_num gpu_num
bash sh/train-mask_v5-xl-datav3-final.sh 16 4
```
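If the training script is launched from Python, for example from a job scheduler wrapper, the call can be assembled as below. This is only a sketch: the helpers `build_train_command`, `total_gpus`, and `launch_training` are hypothetical names, and the only thing taken from the repository is the script path and its two positional arguments, `node_num` and `gpu_num`.

```python
import subprocess

# Script path and argument order as shown in the command above.
TRAIN_SCRIPT = "sh/train-mask_v5-xl-datav3-final.sh"


def build_train_command(node_num, gpu_num):
    """Return the argument list for the training script."""
    if node_num < 1 or gpu_num < 1:
        raise ValueError("node_num and gpu_num must be positive")
    return ["bash", TRAIN_SCRIPT, str(node_num), str(gpu_num)]


def total_gpus(node_num, gpu_num):
    """Total number of GPUs across all nodes."""
    return node_num * gpu_num


def launch_training(node_num, gpu_num):
    """Run the training script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_train_command(node_num, gpu_num), check=True)


if __name__ == "__main__":
    # Only prints the command; call launch_training(16, 4) to actually run it.
    print(" ".join(build_train_command(16, 4)))
    print("total GPUs:", total_gpus(16, 4))
```

With the values from the example command, 16 nodes with 4 GPUs each gives 64 GPUs in total.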

### Inference

Make sure the model structure and the checkpoint path match those configured in the script `sh/interactive_mask_v5.sh`. `start_index` and `end_index` specify the range of input attribute commands.

```shell
# bash sh/interactive_mask_v5.sh xl top_k start_index end_index
bash sh/interactive_mask_v5.sh xl 15 0 5
```

The path to the generated results is `generation/model_name/command_name/topk15/`.
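To collect the generated files afterwards, the output directory can be built from its parts. This is a rough sketch with hypothetical helper names (`result_dir`, `list_results`). It assumes that the `topk15` component is `topk` followed by the `top_k` argument, and that `model_name` and `command_name` are placeholders whose actual values depend on your run.

```python
from pathlib import Path


def result_dir(model_name, command_name, top_k, base="generation"):
    """Build generation/<model_name>/<command_name>/topk<top_k>."""
    # Assumption: the last component is "topk" followed by the top_k value.
    return Path(base) / model_name / command_name / f"topk{top_k}"


def list_results(model_name, command_name, top_k, base="generation"):
    """Return the files in the result directory, sorted by path.

    Returns an empty list if the directory does not exist yet.
    """
    directory = result_dir(model_name, command_name, top_k, base=base)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())


if __name__ == "__main__":
    # "model_name" and "command_name" are placeholders, not real values.
    for path in list_results("model_name", "command_name", 15):
        print(path)
```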

