## SMART: Self-supervised Multi-task pretrAining with contRol Transformers



Pretraining example:

```
python src/dmc_multidomain_train.py --devices 4 --nodes 1 \
        --epochs 10 --n_embd 256 --batch_size 256 --lr 6e-4 \
        --num_steps 80000 --n_layer 8 --n_head 8 --seed 123 \
        --multi_config configs/train_configs/multipretrain_source_v1.json \
        --model_type naive --unsupervise --inverse --forward --rand_inverse \
        --rand_mask_size -1 --mask_obs_size -1 --train_replay_id 5 \
        --biased_multi
```
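
The finetuning commands in this README load the pretrained weights from `PRETRAIN_PATH/checkpoints/last.ckpt`. As a minimal sketch, the helpers below build that path from a pretraining output directory and check that the file exists before a finetuning run is launched. The function names `ckpt_path` and `check_ckpt` are hypothetical and are not part of this repository; only the `checkpoints/last.ckpt` layout is taken from the commands here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this repository).

# Print the checkpoint path for a pretraining output directory.
# $1: the directory used in place of the PRETRAIN_PATH placeholder.
ckpt_path() {
    # ${1%/} drops a single trailing slash so the path has no "//".
    printf '%s/checkpoints/last.ckpt\n' "${1%/}"
}

# Return 0 if the checkpoint file exists, 1 otherwise.
check_ckpt() {
    if [ -f "$(ckpt_path "$1")" ]; then
        echo "found checkpoint: $(ckpt_path "$1")"
    else
        echo "missing checkpoint: $(ckpt_path "$1")" >&2
        return 1
    fi
}
```

One possible use is to run `check_ckpt "$PRETRAIN_PATH"` first and pass `"$(ckpt_path "$PRETRAIN_PATH")"` to `--load_model_from`, so that a wrong directory fails immediately rather than after the training job has been set up.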



Return-to-go (RTG) conditioned finetuning example:

```
python src/dmc_train.py --devices 4 --nodes 1 --seed 123 \
        --epochs 20 --n_embd 256 --lr 6e-4 --batch_size 256 \
        --domain cheetah --task run --num_steps 1000000 --select_rate 0.1 \
        --model_type reward_conditioned --bc --train_replay_id 2 --rand_select \
        --load_model_from PRETRAIN_PATH/checkpoints/last.ckpt --no_load_action
```



Behavior cloning (BC) finetuning example:

```
python src/dmc_train.py --devices 4 --nodes 1 --seed 123 \
        --epochs 20 --n_embd 256 --lr 6e-4 --batch_size 256 \
        --domain cheetah --task run --num_steps 1000000 --select_rate 0.1 \
        --model_type naive --bc --train_replay_id 2 --rand_select \
        --load_model_from PRETRAIN_PATH/checkpoints/last.ckpt --no_load_action
```

