## Guided Star-Shaped Masked Diffusion (G-Star)

An implementation of Guided Star-Shaped Masked Diffusion (G-Star), forked from the MDLM codebase. The repository provides scripts to train the denoiser, train the remasker, and run sampling experiments, including star-shaped sampling.

### Setup
Use the setup script to create the virtual environment and install dependencies:

```bash
bash setup.sh
```

If needed later, reactivate the environment with:

```bash
source .venv/bin/activate
```

### Train
- Denoiser training:

```bash
bash ./scripts/train.sh
```

- G-Star remasker training:

```bash
bash ./scripts/run_remaskator_training.sh
```

Notes:
- Replace placeholders in the scripts as needed:
  - `<wandb_api_key>`: your Weights & Biases API key
  - `<path>`: paths for checkpoints, output directories, and caches
- Data and other settings are configured through the files under `configs/` (e.g., `configs/data/openwebtext-split.yaml`).
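
Before launching a run, it can help to confirm that no placeholder was left unreplaced. The snippet below is a minimal sketch, not part of the repository: `find_placeholders` is a hypothetical helper name, and it assumes the placeholders appear literally as `<wandb_api_key>` and `<path>` in files under `scripts/`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repository): list every line
# under a directory that still contains one of the literal placeholders
# named in the notes above.
find_placeholders() {
  local dir="${1:-scripts}"
  # -r: recurse, -n: show line numbers, -E: extended regex for the alternation
  grep -rnE '<wandb_api_key>|<path>' "$dir" \
    || echo "no placeholders left in $dir"
}

# Check the default location when run from the repository root.
if [ -d scripts ]; then
  find_placeholders scripts
fi
```

Each match is printed as `file:line:content`, so the remaining placeholders can be edited in place.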

### Sampling
All sampling scripts are in:

```
scripts/sampling
```

Examples:
- Star-shaped sampling (sequence length 512):

```bash
bash scripts/sampling/sample_star_shape512.sh
```

- Star-shaped sampling (sequence length 128):

```bash
bash scripts/sampling/sample_star_shape.sh
```

The same folder contains additional scripts for guidance and temperature sweeps and for ReMDM variants. Many scripts expose knobs such as `T_ON`, `T_OFF`, `ALPHA_ON`, `sampling.nucleus_p`, and `sampling.remaskator_temperature` for controlling guidance and shaping the schedule.
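
To see which script sets which knob, one option is to search the sampling folder for the names listed above. This is a sketch under assumptions: `list_knobs` is a hypothetical helper, not part of the repository, and it only reports where a name occurs in a `.sh` file; how each script consumes the value is not something it can tell you.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repository): show every line in
# the sampling scripts that mentions one of the knobs named in this README.
list_knobs() {
  local dir="${1:-scripts/sampling}"
  # -w restricts matches to whole words so that, e.g., T_ON does not match
  # inside a longer identifier; --include limits the search to shell scripts.
  grep -rnwE --include='*.sh' \
    'T_ON|T_OFF|ALPHA_ON|sampling\.nucleus_p|sampling\.remaskator_temperature' \
    "$dir"
}

# Inspect the default folder when run from the repository root.
if [ -d scripts/sampling ]; then
  list_knobs scripts/sampling
fi
```

The output lines have the form `file:line:content`, which makes it easy to compare the values used by different scripts before editing them.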

### Repository layout (fork-specific entry points)
- `setup.sh`: environment setup
- `scripts/train.sh`: denoiser training
- `scripts/run_remaskator_training.sh`: remasker training
- `scripts/sampling/`: sampling experiments, including star-shaped variants

### Acknowledgements
This work extends the MDLM framework and codebase released by the original authors.



