# D$^2$TR

This is the repository for the paper `Decentralized and Disentangled Task–Role Representation Learning for Generalizable Offline Multi-Agent Meta Reinforcement Learning`.

## Algorithm projects

### Installation Instructions

1. Create Environment
    
    Please first create the conda environment by using:
    
    ```bash
    conda create -n offpymarl python=3.10 -y
    ```
    
    Then, install the packages listed in `requirements.txt` by using:
    
    ```bash
    conda activate offpymarl
    pip install -r requirements.txt
    ```
    
2. Install SC2, SMAC, and SMACv2
    
    First, install SC2 by using:
    
    ```bash
    bash install_sc2.sh # if you have not installed StarCraftII on your machine
    ```
    
    Then, install SMAC and SMACv2 by using:
    
    ```bash
    bash install_smac.sh
    ```
    
    Also, download the `32x32_flat.SC2Map` map file into your `SMAC_Maps` folder. The `SMAC_Maps` folder is available [here](https://github.com/oxwhirl/smacv2/releases/tag/maps#:~:text=3-,SMAC_Maps.zip,-503%20KB).
    

### Offline Data Collection

We provide the data collection scripts; simply run

```bash
bash collect_expert.sh # CN Target, CN Agent
bash collect_smac_expert.sh # SMAC environments
bash collect_smac_v2_expert{1/2/3}.sh # SMACv2 environments
```

and fill in every `offline_data_ls` entry in the code.
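
The `{1/2/3}` placeholder in the last command can be expanded with a small loop. This is a minimal sketch, assuming the placeholder stands for three separate scripts named `collect_smac_v2_expert1.sh` to `collect_smac_v2_expert3.sh` in the current directory; the function name and the `DRY_RUN` variable are illustrative and not part of the repository.

```bash
# Hypothetical helper: expand the {1/2/3} placeholder into concrete script names.
smacv2_collect_scripts() {
    for i in 1 2 3; do
        echo "collect_smac_v2_expert${i}.sh"
    done
}

# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
for script in $(smacv2_collect_scripts); do
    if [ "$DRY_RUN" = "1" ]; then
        echo "bash ${script}"
    elif [ -f "$script" ]; then
        bash "$script"
    else
        echo "missing: ${script}" >&2
    fi
done
```

With the default `DRY_RUN=1`, the loop prints the three commands so they can be checked before anything is executed.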

### Offline Training

First, train the task encoder by

```bash
CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_meta --config=meta_encoder --env-config={env_name} --seed=1 --save_model=True &
```

and fill in `encoder_path_ls` in the config files.

Then, for CN environments, obtain the role labels using the code given in the paper; for SMAC and SMACv2 environments, first apply trajectory pre-clustering to all tasks and environments by

```bash
CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_role_prior --config=offline_role_prior --env-config={env_name} --seed=1 --task={task_name} --save_model=True --min_batch_size_clustering={2000/4000/8000} --batch_size_clustering={6000/12000/24000} --n_min_cluster={n_min_cluster} --n_cluster={n_cluster} &
```

The clustering results will be saved at `results/offline_role_prior/xxx/models/xxx/prior_role_id.npz`; fill in `map2prior_role_id` in the code with them.
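
To collect the paths needed for `map2prior_role_id`, the saved files can be listed with `find`. This is a minimal sketch, assuming the results live under `results/offline_role_prior` relative to the current directory, as in the path above. The function name is hypothetical, and how each path maps to an entry in `map2prior_role_id` depends on the code.

```bash
# Hypothetical helper: list every prior_role_id.npz under a results root.
# $1 is the root directory (defaults to the path pattern quoted above).
list_prior_role_files() {
    root="${1:-results/offline_role_prior}"
    if [ ! -d "$root" ]; then
        echo "no such directory: $root" >&2
        return 1
    fi
    find "$root" -type f -name 'prior_role_id.npz' | sort
}

# Print one path per line; a missing directory only produces a message here.
list_prior_role_files || true
```

Since `.npz` files are zip archives of `.npy` arrays, `unzip -l <path>` lists the array names stored in one of them without loading it.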

Then, get the role label by

```bash
python label_gpt_{env}.py
```

We provide `label_gpt_smac1.py`; the scripts for other environments are similar. Set `data_path = ""` in this script to the location where the role data should be saved. Then, set `role_data_root` in the config files to this `data_path`.

Next, train the role encoder by

```bash
CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_meta --config=meta_role_encoder --env-config={env_name} --seed=1 --save_model=True &
```

and fill in `role_encoder_path_ls` in the config files.

Subsequently, train the prior role encoder by

```bash
CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_meta --config=meta_prior_role_encoder --env-config={env_name} --seed=1 --save_model=True &
```

and fill in `prior_role_encoder_path_ls` in the config files.

Finally, train the meta policy by

```bash
CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_meta --config=meta_omiga_updet --env-config={env_name} --seed=1 --save_model=True &
```

and fill in `policy_path_ls` in the config files.
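
The four `--offline_meta` training stages above share one command shape and differ only in `--config`. The sketch below builds that command for each stage and prints the config key to fill in before moving on. It does not cover the pre-clustering and role-labelling steps that sit between the first and second stage. The `ENV_NAME` variable and both function names are hypothetical, and the trailing `&` is dropped on the assumption that each stage should finish before the next one starts.

```bash
# Stage config -> key to fill in afterwards, as listed in the steps above.
stage_key() {
    case "$1" in
        meta_encoder)            echo "encoder_path_ls" ;;
        meta_role_encoder)       echo "role_encoder_path_ls" ;;
        meta_prior_role_encoder) echo "prior_role_encoder_path_ls" ;;
        meta_omiga_updet)        echo "policy_path_ls" ;;
        *) echo "unknown stage: $1" >&2; return 1 ;;
    esac
}

# Build the training command for one stage (printed, not executed).
# $1 = stage config, $2 = env config name, $3 = seed (defaults to 1).
stage_cmd() {
    echo "CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_meta --config=$1 --env-config=$2 --seed=${3:-1} --save_model=True"
}

ENV_NAME="${ENV_NAME:-your_env_name}"  # placeholder, replace with a real env config
for stage in meta_encoder meta_role_encoder meta_prior_role_encoder meta_omiga_updet; do
    stage_cmd "$stage" "$ENV_NAME"
    echo "  -> then fill in $(stage_key "$stage") in the config files"
done
```

Printing the commands instead of running them keeps the manual step visible: each stage's saved model path has to be entered in the config files before the next stage can use it.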

### Online Adaptation

For online adaptation, simply run

```bash
CUDA_VISIBLE_DEVICES=0 python src/main.py --offline_meta --config=meta_omiga_updet_adaptation --env-config={env_name} --seed=1 &
```

## Note

The implementation is based on the open-source [Offpymarl](https://github.com/zzq-bot/offline-marl-framework-offpymarl) framework.