# MisoDICE: Multi-Agent Reinforcement Learning Framework

MisoDICE is a framework for multi-agent reinforcement learning (MARL) research. It provides MisoDICE implementations for continuous and discrete action spaces, Multi-Agent MuJoCo and StarCraft II (SMACv1 and SMACv2) environments, replay buffers, and dataset-generation utilities for developing and evaluating MARL methods.

## Project Structure

```
.
├── analyze.ipynb               # Analysis notebook
├── analyze_llm.ipynb           # Analysis notebook (LLM variant)
├── configs.py                  # Configuration settings for experiments
├── generate_dataset.py         # Script to generate datasets
├── main.py                     # Entry point for running experiments
├── trainer.py                  # Training loop and utilities
├── algos/                      # Algorithms for MARL
│   ├── misodice_continuous.py  # MisoDICE for continuous action spaces
│   └── misodice_discrete.py    # MisoDICE for discrete action spaces
├── buffers/                    # Replay buffer implementations
│   ├── buffer_continuous.py    # Buffer for continuous action spaces
│   └── buffer_discrete.py      # Buffer for discrete action spaces
├── dataset/                    # Dataset-related utilities
├── envs/                       # Multi-agent environments
│   ├── mamujoco/               # Multi-agent MuJoCo environments
│   ├── smacv1/                 # StarCraft II environments (version 1)
│   ├── smacv2/                 # StarCraft II environments (version 2)
│   └── utils.py                # Environment utilities
├── graphs/                     # Graphs and visualizations
├── rollouts/                   # Rollout data
```

## Key Components

### Algorithms
- **MisoDICE**: Implementations for both continuous and discrete action spaces are available in the `algos/` directory.

### Environments
- **Multi-Agent MuJoCo**: Custom multi-agent environments for continuous control tasks.
- **StarCraft II**: SMACv1 and SMACv2 environments for decentralized micromanagement scenarios (`envs/smacv1/`, `envs/smacv2/`).

### Utilities
- **Replay Buffers**: Efficient storage and sampling of experience data.
- **Dataset Generation**: Tools for creating datasets for offline RL.
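
To make the replay-buffer idea concrete, here is a minimal sketch of a fixed-capacity buffer with uniform sampling for discrete multi-agent transitions. It is an illustration only and is not the code in `buffers/`: the `Transition` fields, the `ReplayBuffer` class, and its method names are assumptions chosen for this example, and it uses only the Python standard library.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple


class Transition(NamedTuple):
    """One joint step; per-agent lists share the same agent ordering.

    Hypothetical layout for illustration, not the repository's format.
    """
    obs: List[List[float]]       # one observation vector per agent
    actions: List[int]           # one discrete action index per agent
    reward: float                # shared team reward
    next_obs: List[List[float]]  # per-agent observations after the step
    done: bool                   # True if the episode ended on this step


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform random sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._storage: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._storage)

    def add(self, transition: Transition) -> None:
        # A deque with maxlen evicts the oldest item once it is full.
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Sample without replacement, uniformly over stored transitions.
        if batch_size > len(self._storage):
            raise ValueError("not enough transitions stored")
        return self._rng.sample(list(self._storage), batch_size)
```

In an offline setting, a buffer like this would be filled once from a generated dataset and then only sampled from during training.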

## Getting Started

### Prerequisites
- Python 3.8+
- PyTorch

### Generating Datasets
To generate datasets, use:
```bash
python generate_dataset.py
```

### Running Experiments
To run an experiment, use the `main.py` script:
```bash
python main.py --algo misodice --env_name protoss_5_vs_5 --use_llm
```
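
For readers who want to script runs, the flags in the command above could be parsed roughly as follows. This is a hedged sketch, not the contents of `main.py`: the `build_parser` helper is hypothetical, marking `--algo` and `--env_name` as required is an assumption, and the real script may define more options or different defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three flags shown in the README."""
    parser = argparse.ArgumentParser(description="Run a MARL experiment")
    # Assumption: both values are required; main.py may supply defaults.
    parser.add_argument("--algo", type=str, required=True)
    parser.add_argument("--env_name", type=str, required=True)
    # A boolean switch: present means True, absent means False.
    parser.add_argument("--use_llm", action="store_true")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--algo", "misodice", "--env_name", "protoss_5_vs_5", "--use_llm"]
)
```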

### Training
The training loop is implemented in `trainer.py`. To change training parameters, edit the experiment settings in `configs.py`.

## Documentation
- Supplementary material for MisoDICE is available in [Neurips_2025__MisoDICE_supplementary.pdf](Neurips_2025__MisoDICE_supplementary.pdf).
- Analysis reports can be found in `analyze.pdf` and `analyze_llm.pdf`.

## Contributing
Contributions are welcome! Please submit a pull request or open an issue for any bugs or feature requests.

## License
This project is licensed under the BSD 3-Clause License. See the `LICENSE` file for details.

## Acknowledgments
This framework builds on various open-source libraries and tools. Special thanks to the Jupyter Development Team for their contributions to the ecosystem.