# Disentangled Multi-Agent World Models (DMAWM)
This repository contains the PyTorch implementation of our peer-reviewed work DMAWM (Learning Disentangled Multi-Agent World Model for Decentralized Control). It is provided to help others reproduce the results in the paper.


## Overview
DMAWM is a model-based Multi-Agent Reinforcement Learning (MARL) method designed for learning decentralized policies in a latent space. The architecture of DMAWM consists of independent agent modules and a shared environment module.

![Illustration of agent and environment models](assets/Illustration_agent_and_environment_models.png)

During decentralized execution, each agent module independently updates its internal state based on local information. During centralized training, the environment module is trained to mirror the independent behavior of the agent modules, which disentangles each agent's latent state from the interaction dynamics while still capturing agent interactions.
Consequently, imagined rollouts generated by the environment module simulate decentralized execution dynamics more faithfully, which eases the transfer of policies from imagination to decentralized execution.

## Usage

### Installation

First, create and activate a Conda virtual environment with Python 3.11:

```bash
conda create -n dmawm python=3.11
conda activate dmawm
```

For StarCraft Multi-Agent Challenge (SMAC) environments, follow the installation instructions for the Linux build of StarCraft II (SC2.4.10, i.e. game version 4.10) provided in the [pymarl repository](https://github.com/oxwhirl/pymarl). Ensure that the `SC2PATH` environment variable is set to your StarCraft II installation directory (e.g., `~/pymarl/3rdparty/StarCraftII`).
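
A minimal sketch of setting `SC2PATH`, assuming StarCraft II was installed at the example location mentioned above; adjust the path to match your own installation. Adding the `export` line to your shell profile keeps the variable set across sessions.

```bash
# Assumed install location (the example path from this README); change as needed.
export SC2PATH="$HOME/pymarl/3rdparty/StarCraftII"

# Warn early if the directory does not exist.
if [ ! -d "$SC2PATH" ]; then
    echo "StarCraft II not found at $SC2PATH" >&2
fi
```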

For SMACv2, download the additional maps from the [SMACv2 releases page](https://github.com/oxwhirl/smacv2/releases/tag/maps) and place them in the `$SC2PATH/Maps/SMAC_Maps` directory.
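
To check that the maps ended up in the right place, something like the following can help. It is only a sketch: the fallback path is the example location from this README, and the `MAP_DIR` and `map_count` variable names are illustrative.

```bash
# Use SC2PATH if set; otherwise fall back to the example install path (assumption).
MAP_DIR="${SC2PATH:-$HOME/pymarl/3rdparty/StarCraftII}/Maps/SMAC_Maps"

# Count the StarCraft II map files (.SC2Map) directly inside the directory.
map_count=0
for f in "$MAP_DIR"/*.SC2Map; do
    if [ -e "$f" ]; then
        map_count=$((map_count + 1))
    fi
done

echo "Found $map_count map file(s) in $MAP_DIR"
```

A count of zero suggests the archive was extracted to a different directory or `SC2PATH` points elsewhere.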

Then, install the dependencies:

```bash
pip install -r requirements.txt
```

### Running Experiments

To reproduce the results on the SMACv2 `protoss_5_vs_5` map, execute the following command:

```bash
CUDA_VISIBLE_DEVICES=0 python main.py \
    --name train \
    --trainer dreamer \
    --env smacv2 \
    --env_args.map_name protoss_5_vs_5 \
    --env_args.use_absorbing_state True \
    --env_args.trailing_absorbing_state_length 2 \
    --train.num_env_steps 405000 \
    --use_eval True \
    --replay.capacity 250000 \
    --critic.use_critic_transformer True \
    --train.share_critics True \
    --train.share_actors True \
    --seed 0
```

Alternatively, scripts for reproducing experimental results are available in the `scripts/` directory. For example:

```bash
sh scripts/dreamer/smacv2_protoss_5_vs_5.sh
```

### Configuration

Configuration files are located in the `config/` directory. You can modify these files directly or override settings using command-line arguments.
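
As an illustration of command-line overrides, the sketch below reuses the flags from the training command in Running Experiments and varies only `--seed`. The `run` helper and the seed list are illustrative and not part of this repository. The helper only prints each command (a dry run), so nothing is launched until you change it.

```bash
# Dry-run helper: prints the command instead of executing it.
# Replace the body with `"$@"` to actually launch training.
run() { echo "$@"; }

# Sweep over seeds; every other flag matches the documented training command.
for seed in 0 1 2; do
    run python main.py \
        --name train \
        --trainer dreamer \
        --env smacv2 \
        --env_args.map_name protoss_5_vs_5 \
        --env_args.use_absorbing_state True \
        --env_args.trailing_absorbing_state_length 2 \
        --train.num_env_steps 405000 \
        --use_eval True \
        --replay.capacity 250000 \
        --critic.use_critic_transformer True \
        --train.share_critics True \
        --train.share_actors True \
        --seed "$seed"
done
```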

### Logging

Experiments are tracked using [Weights & Biases (wandb)](https://wandb.ai/). To set up your wandb account, run the following command in your terminal and follow the prompts:

```bash
wandb login
```
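
If the training machine has no network access, the wandb client supports an offline mode through the `WANDB_MODE` environment variable. The sketch below assumes this repository does not override the mode in its own configuration, which has not been verified here.

```bash
# Record runs locally instead of uploading them during training.
export WANDB_MODE=offline

# Later, from a machine with network access, upload a stored run with:
#   wandb sync <path-to-offline-run-directory>
```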

## Acknowledgements

This work builds on the following open-source projects:

- [dreamerv3](https://github.com/danijar/dreamerv3)
- [dreamerv3-torch](https://github.com/NM512/dreamerv3-torch)
- [on-policy](https://github.com/marlbenchmark/on-policy)
