# PUMA

This repository contains the code for PUMA (Progressive Unmasking), the method introduced in the paper "Stop Training for the Worst: Progressive Unmasking Accelerates Masked Diffusion Training".

## Quick Start

### 1. Install Environment

```bash
# Create and activate conda environment
conda env create -f environment.yml
conda activate puma
```

### 2. Download Data

Download all the files from [this Google Drive folder](https://drive.google.com/drive/folders/1TluiZjYl-zLdbxjVmhfWl-WyX_OvD7UW) and put them in the `data/sudoku_new` folder.

Next, download the TinyGSM files from [HuggingFace](https://huggingface.co/datasets/TinyGSM/TinyGSM) and put them in the `data/tiny_gsm` folder. The download should give you the files `labels.bin`, `meta.json`, and `prompt_mask.bin`. Our code handles TinyGSM tokenization automatically.
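
Before submitting a job, it can help to confirm that the data landed where the code expects it. The following is a minimal sketch, not part of the repository: the helper names `missing_tiny_gsm_files` and `sudoku_dir_is_populated` are hypothetical, and the script only checks the folder names and TinyGSM file names mentioned above.

```python
from pathlib import Path

# File names this README says the TinyGSM download should provide.
TINY_GSM_FILES = ("labels.bin", "meta.json", "prompt_mask.bin")


def missing_tiny_gsm_files(data_dir="data/tiny_gsm"):
    """Return the expected TinyGSM files that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in TINY_GSM_FILES if not (root / name).is_file()]


def sudoku_dir_is_populated(data_dir="data/sudoku_new"):
    """Return True if data_dir exists and contains at least one entry."""
    root = Path(data_dir)
    return root.is_dir() and any(root.iterdir())


if __name__ == "__main__":
    missing = missing_tiny_gsm_files()
    if missing:
        print("Missing TinyGSM files:", ", ".join(missing))
    else:
        print("All expected TinyGSM files found.")
    if not sudoku_dir_is_populated():
        print("data/sudoku_new is missing or empty.")
```

Run it from the repository root so the relative paths resolve correctly.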

### 3. Run Training

Submit a job using the SLURM script:

```bash
sbatch job.sh
```

The SLURM script calls `train.py`, which handles the training loop.

### 4. Configuration

Config files are located in `yaml_files/`. Edit these YAML files to adjust:
- Model architecture
- Training hyperparameters  
- Dataset settings
- Logging options

We provide one config each for PUMA and the baseline in the following three settings: Sudoku, TinyGSM (standard), and TinyGSM (block diffusion).
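
To see which configs are available before editing one, a short script like the one below can list them. This is a sketch under assumptions: the function name `list_configs` is hypothetical, and it assumes the files in `yaml_files/` use a `.yaml` or `.yml` extension.

```python
from pathlib import Path


def list_configs(config_dir="yaml_files"):
    """Return the sorted names of YAML files found directly in config_dir."""
    root = Path(config_dir)
    if not root.is_dir():
        return []
    # Assumption: configs end in .yaml or .yml; other files are ignored.
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_file() and path.suffix in (".yaml", ".yml")
    )


if __name__ == "__main__":
    for name in list_configs():
        print(name)
```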

### 5. Monitoring

Training logs and checkpoints are saved to the paths specified in your config file. The training script (`train.py`) also logs results to Weights & Biases (wandb).
