# README.md

## 🧩 Project Overview

This repository accompanies the paper
***“Generative Evolutionary Meta-Solver (GEMS): Scalable Surrogate-Free Multi-Agent Learning”***,
submitted to **Transactions on Machine Learning Research (TMLR)**.

It contains complete implementations of **GEMS** and multiple **PSRO-based baselines**, designed to study **meta-strategic exploration**, **policy-space evolution**, and **multi-agent equilibrium dynamics** in diverse environments.

The repository provides a unified, reproducible framework to replicate all quantitative and qualitative results presented in the paper.

---

## 📂 Folder Structure

```
project_root/
│
├── deceptive_mean/      # Deceptive Messages environment (GEMS & PSRO)
│   ├── gems.py
│   ├── psro.py
│   ├── apsro.py
│   ├── do.py
│   ├── alphapsro.py
│   └── README.md
│
├── khun_poker/          # Kuhn Poker (GEMS & PSRO variants)
│   ├── gems.py
│   ├── psro.py
│   ├── apsro.py
│   ├── neupl.py
│   ├── epsro.py
│   ├── p2sro.py
│   ├── alphapsro.py
│   └── README.md
│
├── pettingzoo/          # PettingZoo MPE environments (simple_tag_v3, simple_spread_v3)
│   ├── run.sh
│   ├── gems.py
│   ├── psro.py
│   └── README.md
│
├── Appendix/            # Chess environment (GEMS-only)
│   ├── chess.py
│   └── Chess Results.gif
│
└── GIFs/                # Visualization outputs for GEMS & PSRO runs
    ├── GEMS/
    │   ├── seed_0.gif
    │   ├── seed_1.gif
    │   ├── seed_2.gif
    │   ├── seed_3.gif
    │   └── seed_4.gif
    │
    └── PSRO/
        ├── seed_0.gif
        ├── seed_1.gif
        ├── seed_2.gif
        ├── seed_3.gif
        └── seed_4.gif
```

---

## 🎥 Visualization Outputs

The `GIFs/` directory contains **animated qualitative evaluations** of agent behavior:

* **`GIFs/GEMS/`** → GEMS-generated runs
* **`GIFs/PSRO/`** → PSRO baseline runs

Each folder contains one run per seed (**seeds 0–4**), so agent behavior can be compared across random initializations.

Additionally, the **`Appendix/`** folder includes **Chess Results.gif**, a GEMS-only demonstration illustrating emergent strategy learning in a deterministic Chess environment.

---

## ⚙️ Installation & Environment Setup

To reproduce **all results from the TMLR paper**, use the provided `environment.yml` (preferred for reviewers)
or `requirements.txt` (for custom setups).

---

### 🧩 1. Create a Conda Environment (Recommended)

Using the included **`environment.yml`** recreates the environment used for the experiments presented in the paper.

```bash
# Create environment from YAML
conda env create -f environment.yml

# Activate environment
conda activate gems
```

To verify CUDA availability:

```bash
python -c "import torch; print(torch.cuda.is_available())"
```

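If this prints `True`, PyTorch can access your GPU. If it prints `False`, PyTorch was either installed without CUDA support or cannot find a compatible GPU driver.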
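
The same check can be used inside a script to choose a device at runtime. The snippet below is a minimal sketch, not code taken from this repository: `pick_device` is a hypothetical helper, and the experiment scripts may select their device differently. It falls back to the CPU when CUDA, or PyTorch itself, is unavailable.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu".

    Hypothetical helper for illustration; the repository scripts may
    select their device differently.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```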

---

### ⚙️ 2. Manual Setup (Alternative)

If you prefer manual installation:

```bash
conda create -n gems python=3.11.9
conda activate gems
```

Install core dependencies:

```bash
# Install GPU build of PyTorch
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

# Install remaining dependencies
pip install -r requirements.txt
```

For CPU-only users:

```bash
pip install torch numpy matplotlib tqdm
```

---

## 📦 Dependency Summary

| Package                      | Version     | Purpose                        |
| ---------------------------- | ----------- | ------------------------------ |
| **Python**                   | 3.11.9      | Core interpreter               |
| **PyTorch**                  | 2.8.0+cu128 | Deep learning backend          |
| **NumPy**                    | 2.1.3       | Numerical computing            |
| **Matplotlib**               | 3.10.0      | Visualization                  |
| **PettingZoo**               | latest      | Multi-agent MPE environments   |
| **Gymnasium / Pygame**       | latest      | Rendering & simulation         |
| **Pandas, Seaborn, MoviePy** | latest      | Data & visualization utilities |
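
To confirm that an existing environment matches the pinned versions in the table, a short standard-library check along the following lines may help. This is a sketch under assumptions: `check_versions` and `EXPECTED` are hypothetical names, only the packages with an exact version in the table are checked, and the PyTorch build tag (for example `+cu128`) can differ depending on how the wheel was installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Pinned versions copied from the table above; packages listed as
# "latest" there are omitted because no exact version is given.
EXPECTED = {
    "torch": "2.8.0+cu128",
    "numpy": "2.1.3",
    "matplotlib": "3.10.0",
}


def check_versions(expected: dict) -> dict:
    """Map each package to (installed version or None, matches expected?)."""
    report = {}
    for name, wanted in expected.items():
        installed: Optional[str]
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            installed = None
        report[name] = (installed, installed == wanted)
    return report


for name, (installed, ok) in check_versions(EXPECTED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name:<12} expected={EXPECTED[name]:<14} installed={installed} [{status}]")
```

A `MISMATCH` line does not necessarily mean the experiments will fail; it only flags a difference from the versions listed above.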

---

## ▶️ Running Experiments

Each experimental domain (`deceptive_mean`, `khun_poker`, `pettingzoo`) includes its own `README.md`
describing available arguments, seeds, and result formats.

### Example Commands

```bash
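# Run each command from the repository root; the subshells keep
# your working directory unchanged between commands.

# Run GEMS on Kuhn Poker
(cd khun_poker && python gems.py --seed 0)

# Run PSRO baseline on PettingZoo Simple Tag
(cd pettingzoo && python psro.py --env simple_tag_v3 --seed 2)

# Run GEMS on Chess environment (Appendix)
(cd Appendix && python chess.py)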
```

All outputs (CSV metrics, logs, GIFs) are saved automatically in the corresponding domain folder.
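
To launch the same script for several seeds, a small driver such as the one below could be used from the repository root. It is a sketch, not part of the repository: `build_commands` and `run_sweep` are hypothetical helpers, and the only flags used are the `--seed` and `--env` flags from the example commands above. It runs in dry-run mode by default and only prints the commands.

```python
import subprocess
import sys

SEEDS = [0, 1, 2, 3, 4]


def build_commands(script, seeds, extra_args=()):
    """Return one argv list per seed, e.g. [python, gems.py, --seed, 0]."""
    return [
        [sys.executable, script, *extra_args, "--seed", str(seed)]
        for seed in seeds
    ]


def run_sweep(script, cwd, seeds=SEEDS, extra_args=(), dry_run=True):
    """Run (or, by default, only print) one training job per seed."""
    for cmd in build_commands(script, seeds, extra_args):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            # check=True aborts the sweep on the first failing run.
            subprocess.run(cmd, cwd=cwd, check=True)


# Dry run: prints the five commands without launching them.
run_sweep("gems.py", cwd="khun_poker")
```

Passing `dry_run=False` executes the runs sequentially inside the given folder, for example `run_sweep("psro.py", cwd="pettingzoo", extra_args=["--env", "simple_tag_v3"], dry_run=False)`.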

---

## 🧪 Reproducibility Notes

To exactly reproduce the results reported in the TMLR paper:

1. Create your environment using `environment.yml`.
2. Run all experiments for seeds: `0, 1, 2, 3, 4`.
3. Use consistent hyperparameters from each folder’s `README.md`.
4. Compare the training curves, equilibrium trends, and visualizations with those shown in the main paper.
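
When comparing curves across seeds (step 4), it can help to reduce the per-seed runs to a mean and standard deviation at each step. The sketch below assumes each seed produces a metric curve of equal length; `aggregate_curves` is a hypothetical helper, the CSV layout written by the scripts is not assumed here, and the numbers are placeholders rather than results from the paper.

```python
from statistics import mean, stdev


def aggregate_curves(curves):
    """Return per-step (mean, sample std) across equally long seed curves."""
    lengths = {len(c) for c in curves.values()}
    if len(lengths) != 1:
        raise ValueError("all seeds must log the same number of steps")
    stats = []
    for values in zip(*curves.values()):
        # Sample std needs at least two seeds; report 0.0 for a single seed.
        spread = stdev(values) if len(values) > 1 else 0.0
        stats.append((mean(values), spread))
    return stats


# Placeholder numbers for illustration only; these are NOT results from the paper.
toy_curves = {
    0: [1.0, 0.5, 0.25],
    1: [3.0, 1.5, 0.75],
}
for step, (mu, sigma) in enumerate(aggregate_curves(toy_curves)):
    print(f"step {step}: mean={mu:.3f} std={sigma:.3f}")
```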

All experiments were conducted on **NVIDIA A2000 GPUs with CUDA 12.1 or later**,
but the code should run on any CUDA-enabled PyTorch installation.

---

## 🧭 Summary

This repository offers a **scalable and reproducible** framework for studying
**surrogate-free multi-agent learning** under exploration-heavy conditions.

* Unified implementation of **GEMS** and **PSRO** families
* Modular across environments: Deceptive Messages, Kuhn Poker, PettingZoo MPE, and Chess
* Compatible with both GPU (recommended) and CPU setups
* Includes reproducible outputs for all seeds and environments
