# Supplementary Material: Memory-Efficient PRNG Training

This supplementary material contains scripts for training transformer models on pseudorandom number generator (PRNG) tasks.

## Contents

- `prng_train_lowmem.py` - Memory-efficient single-modulus training script
- `curriculum_lowmem.py` - Multi-modulus curriculum learning script
- `utils/` - Supporting utility modules for data generation, curriculum learning, and training

## Supported PRNG Types

- **LCG (Linear Congruential Generator)**: Sequences defined by the recurrence `x_{n+1} = (a * x_n + c) mod m`
- **TLCG (Truncated LCG)**: LCG whose outputs are truncated to a subset of the state bits
- **PCG variants**: PCG_RS, PCG_RR, PCG_XSH_RR, PCG_XSH_RS, PCG_XSL_RR
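
For orientation, the first two generator families can be sketched in a few lines of Python. This is a minimal illustration, not the code in `utils/`: the function names are hypothetical, the seed `x0` is assumed not to be emitted as an output, and the truncated variant is assumed to keep the high-order bits of a power-of-two modulus (the usual convention, which may differ from the scripts).

```python
def lcg(a, c, m, x0, n):
    """Return n outputs of the recurrence x_{k+1} = (a * x_k + c) mod m."""
    x = x0
    outputs = []
    for _ in range(n):
        x = (a * x + c) % m
        outputs.append(x)
    return outputs


def truncated_lcg(a, c, m, x0, n, bits_to_keep):
    """Return LCG outputs reduced to their top `bits_to_keep` bits.

    Assumption: m is a power of two, so each state has log2(m) bits
    and the low-order bits are the ones discarded.
    """
    state_bits = m.bit_length() - 1  # log2(m) for a power-of-two modulus
    shift = state_bits - bits_to_keep
    return [x >> shift for x in lcg(a, c, m, x0, n)]


if __name__ == "__main__":
    # Tiny modulus so the sequence is easy to check by hand.
    print(lcg(a=5, c=3, m=16, x0=1, n=4))
    print(truncated_lcg(a=5, c=3, m=16, x0=1, n=4, bits_to_keep=2))
```

With `m = 2**18` and 9 kept bits, as in the Quick Start command below, each truncated output under these assumptions would fall in the range 0 to 511.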


## Quick Start

### Basic Single-Modulus Training

```bash
# Train on XSLRR
python prng_train_lowmem.py \
    --type "XSLRR" \
    --m $((2**18)) \
    --n_a 1024 --n_c 1024 \
    --n_test_a 128 --n_test_c 64 \
    --n_example 1 \
    --seq_len 513 \
    --n_layer 4 --n_head 8 --n_embd 1024 \
    --num_steps 400000 \
    --warm_steps 5000 \
    --batch_size 512 \
    --grad_acc_steps 1 \
    --lr_trgt 0.0001 \
    --weight_decay 0.1 \
    --eval_interval 2000 \
    --num_workers 8 \
    --digits 1 \
    --bits_to_keep 9 \
    --control_bits 3 \
    --base 512 \
    --data_seed 1 \
    --main_seed 1 \
    --save_correctness \
    --save_params
```

### Multi-Modulus Curriculum Learning

```bash
# Train using a YAML configuration file
python curriculum_lowmem.py --config configs/curriculum.yaml
```

Example YAML configuration:

```yaml
experiment_name: "Multi-Modulus Curriculum"
description: "Curriculum learning on XSLRR with multiple moduli"

model:
  n_layer: 4
  n_head: 8
  n_embd: 1024
  no_rope: false

data:
  type: "XSLRR"
  moduli: [65536, 262144]
  bits_to_keep: [8, 9]
  seq_len: 513
  n_a: 1024
  n_c: 1024
  n_test_a: 128
  n_test_c: 64
  n_example: 1
  control_bits: 3
  base: 1024
  vocab_size: 1024
  digits: 1

curriculum:
  sampler_update_interval: 10
  phases:
    - name: "Phase 1"
      phase_steps: 40000
      transition_steps: 40000
      warmup_steps: 5000
      start_weights: [0.001, 0.999]
      end_weights: [0, 1]
      transition: "exp"
      lr_decay: "constant"
    - name: "Phase 2"
      phase_steps: 10000
      transition_steps: 0
      warmup_steps: 0
      start_weights: [0, 1]
      end_weights: [0, 1]
      transition: "exp"
      lr_decay: "cosine"

training:
  lr_trgt: 0.0001
  lr_min: 1e-7
  batch_size: 256
  grad_acc_steps: 1
  weight_decay: 0.1
  eval_interval: 2000
  beta1: 0.9
  beta2: 0.999

seeds:
  main_seed: 1
  data_seed: 2

output:
  results_dir: "results/curriculum"
  save_checkpoints: false
  checkpoint_interval: 1000
  save_correctness: true
  save_params: true

pretrained:
  pretrained_path: null
  pretrain_m: null

wandb:
  use_wandb: false
  project: "prng_curriculum"
  entity: null
  name: null
  tags: null
  notes: null
```
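
To make the roles of `start_weights`, `end_weights`, and `transition_steps` concrete, here is a rough sketch of how per-modulus sampling weights could move across a phase. It is not the schedule implemented in `curriculum_lowmem.py`: the function names are hypothetical, and plain linear interpolation is used as a stand-in because the exact form of the `"exp"` transition is not described here.

```python
import random


def mix_weights(step, start_weights, end_weights, transition_steps):
    """Interpolate sampling weights at `step` within a phase.

    Assumption: linear interpolation, used only for illustration;
    the script's "exp" transition may follow a different curve.
    """
    if transition_steps <= 0:
        # No transition window: use the end weights for the whole phase.
        return list(end_weights)
    frac = min(max(step / transition_steps, 0.0), 1.0)
    weights = [s + frac * (e - s) for s, e in zip(start_weights, end_weights)]
    total = sum(weights)
    return [w / total for w in weights]  # renormalize to sum to 1


def sample_modulus(moduli, weights, rng):
    """Draw one modulus according to the current sampling weights."""
    return rng.choices(moduli, weights=weights, k=1)[0]


if __name__ == "__main__":
    moduli = [65536, 262144]
    rng = random.Random(0)
    for step in (0, 20000, 40000):
        w = mix_weights(step, [0.001, 0.999], [0, 1], 40000)
        print(step, w, sample_modulus(moduli, w, rng))
```

In the example configuration, Phase 2 sets `transition_steps: 0` with identical start and end weights, which in this sketch means every batch is drawn from the second modulus.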


## Requirements

- Python 3.8+
- PyTorch 2.0+
- NumPy
- Pandas
- PyYAML
- SymPy
- scikit-learn
- Weights & Biases (optional, for experiment tracking)
- psutil (optional, for CPU monitoring)
