# Offline Multi-agent Continual Cooperation via Skill Partition and Reuse

This is the implementation of the submitted paper "Offline Multi-agent Continual Cooperation via Skill Partition and Reuse".

## Installation instructions

### Install StarCraft II

Set up StarCraft II and SMAC:

```bash
bash install_sc2.sh
```

This downloads SC2.4.10 into the `3rdparty` folder and copies the maps needed to run the experiments. You may also need to persist the environment variable `SC2PATH` (e.g., by appending this command to your `.bashrc`):

```bash
export SC2PATH=[Your SC2 folder like /abc/xyz/3rdparty/StarCraftII]
```
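Before launching long runs, it can help to confirm that `SC2PATH` is actually visible to Python. The following is a minimal sketch, not part of this repository: the helper name `check_sc2path` is hypothetical, and the check for a `Maps` subfolder assumes the usual StarCraft II directory layout.

```python
import os
from pathlib import Path


def check_sc2path(env=None):
    """Return a list of human-readable problems with SC2PATH (empty if none)."""
    env = os.environ if env is None else env
    value = env.get("SC2PATH", "")
    if not value:
        return ["SC2PATH is not set"]
    root = Path(value).expanduser()
    if not root.is_dir():
        return [f"SC2PATH does not point to a directory: {root}"]
    problems = []
    # Assumption: maps live in a 'Maps' subfolder of the StarCraft II install.
    if not (root / "Maps").is_dir():
        problems.append(f"no Maps folder found under {root}")
    return problems


if __name__ == "__main__":
    issues = check_sc2path()
    print("SC2PATH looks fine" if not issues else "\n".join(issues))
```

An empty list means the variable is set and points to a directory that contains a `Maps` folder; it does not verify that the game binaries themselves work.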

### Install Python environment

Create the Python environment with conda:

```bash
conda create -n comad python=3.10 -y
conda activate comad
pip install -r requirements.txt
```

### Configure SMAC package

We extend the original [SMAC](https://github.com/oxwhirl/smac) package by adding additional maps for multi-task evaluation. Here is a simple script that makes some modifications to `smac` and copies the additional maps to the StarCraft II installation. Please make sure that you have set `SC2PATH` correctly.

```bash
git clone https://github.com/oxwhirl/smac.git
pip install -e smac/
bash install_smac_patch.sh
```

## Run experiments

### Data Collection

As the datasets are large, we recommend collecting data with the standardized collection script, e.g.:
```bash
python src/main.py --collect --config=qmix --env-config=sc2_collect --offline_data_quality=expert --num_episodes_collected=2000 --map_name=5m_vs_6m --save_replay_buffer=False
```

To collect medium-quality data, you can specify a `stop_winrate`: the policy starts collecting data once it reaches that test win rate.

```bash
python src/main.py --collect --config=qmix --env-config=sc2_collect --offline_data_quality=medium --num_episodes_collected=2000 --map_name=5m_vs_6m --save_replay_buffer=False --stop_winrate=0.5
```
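When collecting several datasets, assembling the command programmatically avoids copy-paste mistakes. The sketch below is illustrative only: `build_collect_cmd` is a hypothetical helper, it reuses just the flags shown in the commands above, and it assumes that `stop_winrate` is passed in the `--name=value` form like the other options.

```python
import shlex


def build_collect_cmd(map_name, quality, episodes=2000, stop_winrate=None):
    """Assemble the argument list for one data-collection run."""
    cmd = [
        "python", "src/main.py", "--collect",
        "--config=qmix",
        "--env-config=sc2_collect",
        f"--offline_data_quality={quality}",
        f"--num_episodes_collected={episodes}",
        f"--map_name={map_name}",
        "--save_replay_buffer=False",
    ]
    # Only medium-quality collection in the examples above uses a stop win rate.
    if stop_winrate is not None:
        cmd.append(f"--stop_winrate={stop_winrate}")
    return cmd


if __name__ == "__main__":
    for quality, winrate in [("expert", None), ("medium", 0.5)]:
        cmd = build_collect_cmd("5m_vs_6m", quality, stop_winrate=winrate)
        print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root; the script above only prints the commands so they can be inspected first.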

### Running COMAD

Run COMAD with a specific task config using the following command, which trains on the corresponding data:

```bash
python src/main.py --transfer --config=comad --env-config=cn_transfer --task-config=cn_cont_expert --cont_train_steps=20000 --stage1_steps=10000
```

The `--task-config` flag accepts any existing config name in the `src/config/tasks/` directory, and any other config entry named `xx` can be set with `--xx=value`.
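To make the `--xx=value` convention concrete, here is a minimal sketch of how such overrides can be split into key/value pairs. This is not the repository's argument parser; `parse_overrides` is a hypothetical helper written for illustration, and it leaves every value as a string.

```python
def parse_overrides(argv):
    """Collect '--key=value' tokens into a dict; return other tokens separately."""
    overrides, ignored = {}, []
    for token in argv:
        if token.startswith("--") and "=" in token:
            # Split on the first '=' only, so values may themselves contain '='.
            key, value = token[2:].split("=", 1)
            overrides[key] = value
        else:
            ignored.append(token)
    return overrides, ignored


if __name__ == "__main__":
    args = ["--transfer", "--task-config=cn_cont_expert", "--cont_train_steps=20000"]
    overrides, ignored = parse_overrides(args)
    print(overrides)
    print(ignored)
```

In this sketch, a token without the leading `--` or without `=` lands in the ignored list rather than in the overrides, which is a useful thing to check when an option appears to have no effect.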

All results will be stored in the `results` folder. You can find the console output, config, and TensorBoard logs in the corresponding directory.

---

Our code is built upon ODIS; please refer to [https://github.com/LAMDA-RL/ODIS](https://github.com/LAMDA-RL/ODIS) for details.
