# Training Tiny Recursive Models (TRM) on N-Queens

This package provides everything you need to train the Tiny Recursive Model on the N-Queens problem.

## What's Included

### 1. Dataset Builder
**File:** `TinyRecursiveModels/dataset/build_nqueens_dataset.py`

A complete N-Queens dataset builder that follows TRM's data format conventions. It generates:
- Training and test sets with configurable sizes
- Partial or complete puzzle inputs
- Valid N-Queens solutions as labels
- Data augmentation using dihedral symmetries (rotations + reflections)

### 2. Training Script
**File:** `run_nqueens_trm.sh`

A comprehensive bash script with step-by-step comments that:
- Sets up the environment
- Generates the N-Queens dataset
- Trains the TRM model
- Saves checkpoints and logs

Every step is thoroughly documented with explanations of what's happening and why.

### 3. Quick Start Guide
**File:** `NQUEENS_TRM_QUICKSTART.md`

A detailed guide covering:
- Overview of TRM and N-Queens
- Prerequisites and system requirements
- Quick start instructions
- Detailed explanations of all parameters
- Customization options
- Troubleshooting common issues
- Performance expectations

### 4. Test Script
**File:** `test_nqueens_dataset.py`

A Python script to verify the dataset builder works correctly:
- Generates a small test dataset
- Validates all solutions
- Visualizes example puzzles
- Checks data format

## Quick Start

### Fastest Way to Get Started

```bash
# 1. Navigate to TinyRecursiveModels
cd /home/ubuntu/TinyRecursiveModels

# 2. Run the complete pipeline
bash /home/ubuntu/run_nqueens_trm.sh
```

This single command handles everything: environment setup, dataset generation, and training.

### Manual Step-by-Step

If you prefer to run each step separately:

```bash
# 1. Install dependencies
cd /home/ubuntu/TinyRecursiveModels
pip install -r requirements.txt
pip install --no-cache-dir --no-build-isolation adam-atan2

# 2. Generate dataset
python dataset/build_nqueens_dataset.py \
  --board-size 8 \
  --num-train 800 \
  --num-test 200 \
  --output-dir data/nqueens-8x8-1k-aug-8

# 3. Train model
python pretrain.py \
  arch=trm \
  data_paths="[data/nqueens-8x8-1k-aug-8]" \
  evaluators="[]" \
  epochs=50000 \
  eval_interval=5000 \
  lr=1e-4 \
  puzzle_emb_lr=1e-4 \
  weight_decay=1.0 \
  puzzle_emb_weight_decay=1.0 \
  arch.L_layers=2 \
  arch.H_cycles=3 \
  arch.L_cycles=6 \
  +run_name=nqueens_trm_8x8 \
  ema=True
```

### Test the Dataset Builder

Before running the full pipeline, you can test the dataset builder:

```bash
python /home/ubuntu/test_nqueens_dataset.py
```

This will generate a small test dataset and verify all solutions are valid.

## File Locations

After setup, your files will be organized as follows:

```
/home/ubuntu/
├── TinyRecursiveModels/              # Main TRM repository
│   ├── dataset/
│   │   └── build_nqueens_dataset.py  # N-Queens dataset builder (NEW)
│   ├── data/
│   │   └── nqueens-8x8-1k-aug-8/     # Generated dataset
│   ├── outputs/
│   │   └── nqueens_trm_8x8/          # Training outputs
│   └── pretrain.py                    # Main training script
│
├── run_nqueens_trm.sh                 # Complete pipeline script
├── test_nqueens_dataset.py            # Dataset validation script
├── NQUEENS_TRM_QUICKSTART.md          # Detailed guide
└── NQUEENS_TRM_README.md              # This file
```

## Understanding the Components

### Dataset Format

**Input:** Partial N-Queens board
- Flattened to 1D sequence (8×8 → 64 elements)
- Vocabulary: 0 (PAD), 1 (empty), 2 (queen)
- Example: `[1, 1, 2, 1, ...]` = empty, empty, queen, empty, ...

**Label:** Complete valid solution
- Same format as input
- All N queens correctly placed
- No conflicts (rows, columns, diagonals)

**Augmentation:** 8 variants per puzzle
- Identity, 90°, 180°, 270° rotations
- Horizontal flip, vertical flip
- Transpose, anti-diagonal reflection

### TRM Architecture

The model uses recursive reasoning:

```
Initialize: x (input), y (answer), z (latent)

For H_cycles iterations:
    For L_cycles iterations:
        Update z based on (x, y, z)  # Think
    
    Update y based on (y, z)          # Revise answer

Return final y
```

**Key Parameters:**
- `L_layers=2`: Network depth
- `H_cycles=3`: Number of answer revisions
- `L_cycles=6`: Thinking steps per revision

### Training Process

**Duration:** ~12-24 hours on single GPU

**Metrics:**
- `train_loss`: Training error (should decrease)
- `test_loss`: Test error (should decrease)
- `test_accuracy`: % of correctly solved puzzles

**Checkpoints:** Saved in `outputs/nqueens_trm_8x8/`

## Customization Examples

### Different Board Sizes

**4×4 (Easy):**
```bash
python dataset/build_nqueens_dataset.py \
  --board-size 4 \
  --num-train 200 \
  --num-test 50 \
  --output-dir data/nqueens-4x4
```

**10×10 (Hard):**
```bash
python dataset/build_nqueens_dataset.py \
  --board-size 10 \
  --num-train 2000 \
  --num-test 500 \
  --output-dir data/nqueens-10x10
```

### Difficulty Levels

**Empty board (hardest):**
```bash
python dataset/build_nqueens_dataset.py \
  --board-size 8 \
  --num-given-queens 0
```

**Half-filled (medium):**
```bash
python dataset/build_nqueens_dataset.py \
  --board-size 8 \
  --num-given-queens 4
```

### Model Capacity

**Deeper reasoning:**
```bash
python pretrain.py \
  arch.H_cycles=5 \
  arch.L_cycles=8 \
  ...
```

**Larger network:**
```bash
python pretrain.py \
  arch.L_layers=3 \
  ...
```

## Multi-GPU Training

For faster training with multiple GPUs:

```bash
torchrun --nproc-per-node 4 --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 pretrain.py \
  arch=trm \
  data_paths="[data/nqueens-8x8-1k-aug-8]" \
  evaluators="[]" \
  epochs=50000 \
  eval_interval=5000 \
  lr=1e-4 \
  puzzle_emb_lr=1e-4 \
  weight_decay=1.0 \
  puzzle_emb_weight_decay=1.0 \
  arch.L_layers=2 \
  arch.H_cycles=3 \
  arch.L_cycles=6 \
  +run_name=nqueens_trm_8x8 \
  ema=True
```

Replace `4` with your number of GPUs.

## Troubleshooting

### Common Issues

**"CUDA out of memory"**
- Reduce batch size: Add `global_batch_size=64`
- Use smaller model: Reduce layers/cycles
- Use CPU (slow): Remove CUDA requirements

**"No module named 'adam_atan2'"**
- Install: `pip install --no-cache-dir --no-build-isolation adam-atan2`

**Training very slow**
- Check GPU usage: `nvidia-smi`
- Disable W&B: `export WANDB_MODE=disabled`
- Use multi-GPU training

**Model not learning**
- Check dataset: `python test_nqueens_dataset.py`
- Increase learning rate: `lr=5e-4`
- Reduce weight decay: `weight_decay=0.1`

For more troubleshooting, see `NQUEENS_TRM_QUICKSTART.md`.

## Expected Performance

### 8×8 N-Queens
- **Training time:** 12-24 hours (single GPU)
- **Expected accuracy:** 70-90%
- **Model size:** ~7M parameters

### 10×10 N-Queens
- **Training time:** 24-48 hours
- **Expected accuracy:** 50-70%
- **Requires:** More data, deeper model

### 12×12 N-Queens
- **Training time:** 48+ hours
- **Expected accuracy:** 30-50%
- **Very challenging:** Needs significant compute

## How It Works

### Why TRM for N-Queens?

Traditional approaches to N-Queens:
- **Backtracking:** Fast but not learnable
- **Constraint solvers:** Efficient but domain-specific
- **Large models:** Overkill and expensive

TRM offers a middle ground:
- **Small:** Only 7M parameters
- **Learnable:** Trains from scratch on puzzles
- **Recursive:** Iteratively improves solutions
- **Generalizable:** Can adapt to different board sizes

### The Recursive Reasoning Process

1. **Initial guess:** Model makes a first attempt at placing queens
2. **Evaluate:** Check for conflicts (same row/column/diagonal)
3. **Revise:** Update queen placements to fix conflicts
4. **Repeat:** Iterate multiple times (H_cycles)
5. **Output:** Final configuration

Each revision cycle involves:
- Updating internal reasoning state (L_cycles)
- Adjusting queen placements based on constraints
- Learning from mistakes in previous iterations

### Why It Works

N-Queens is perfect for recursive reasoning because:
- Solutions can be built incrementally
- Conflicts can be detected and fixed
- Multiple reasoning steps help avoid local minima
- Symmetries (rotations/reflections) provide more training data

## References

**TRM Paper:**
- "Less is More: Recursive Reasoning with Tiny Networks"
- Alexia Jolicoeur-Martineau, 2025
- https://arxiv.org/abs/2510.04871

**TRM Repository:**
- https://github.com/SamsungSAILMontreal/TinyRecursiveModels

**N-Queens Problem:**
- https://en.wikipedia.org/wiki/Eight_queens_puzzle

## Next Steps

1. **Run the test script** to verify everything works:
   ```bash
   python /home/ubuntu/test_nqueens_dataset.py
   ```

2. **Start training** with the complete pipeline:
   ```bash
   bash /home/ubuntu/run_nqueens_trm.sh
   ```

3. **Monitor progress** by checking the logs:
   ```bash
   tail -f outputs/nqueens_trm_8x8/train.log
   ```

4. **Experiment** with different configurations:
   - Try different board sizes (4, 6, 10, 12)
   - Adjust model capacity (layers, cycles)
   - Vary difficulty (num_given_queens)

5. **Analyze results** to understand:
   - Which puzzles are easy vs. hard?
   - How many reasoning cycles are needed?
   - Does the model generalize to larger boards?

## Support

For questions or issues:
1. Check the troubleshooting section in `NQUEENS_TRM_QUICKSTART.md`
2. Review the detailed comments in `run_nqueens_trm.sh`
3. Test the dataset builder with `test_nqueens_dataset.py`
4. Consult the TRM paper for architecture details

## License

This code builds upon TinyRecursiveModels, which is based on the Hierarchical Reasoning Model. Please refer to the original repositories for license information.
