# Matrix Multiplication Project

## How to Run

This project contains several training scripts and SLURM job scripts for running experiments on matrix multiplication tasks with different neural architectures.

### 1. Python Training Scripts

You can run the training scripts directly with Python. Example:

```bash
python3 -u train_rnn_relu.py --data_dir data/mm_T100s --m_max 60 --cuda \
  --batch_size 256 --amp --amp_dtype bf16 \
  --num_workers 2 --prefetch_factor 4 --persistent_workers \
  --eval_every 2 --max_steps 30000
```

Replace `train_rnn_relu.py` with any of the following, depending on your experiment:
- `train_rnn_relu.py`
- `train_transformer.py`
- `train_rwkv7.py`
- `train_deltanet.py`
- `train_mamba.py`

Each script accepts its own set of command-line arguments. Run a script with `--help`, or read the argument definitions at the top of the file, to see what it supports.
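To launch the same configuration for every architecture, the commands can be built programmatically. The sketch below is a hypothetical helper, not part of the repository: `build_command` and `COMMON_ARGS` are illustrative names, and it assumes every script accepts the flags from the example command, which should be checked with `--help` first.

```python
import shlex

# Script names as listed in this README.
TRAIN_SCRIPTS = [
    "train_rnn_relu.py",
    "train_transformer.py",
    "train_rwkv7.py",
    "train_deltanet.py",
    "train_mamba.py",
]

# Flags copied from the example command in this README. Whether every
# script accepts all of them is an assumption; verify with --help.
COMMON_ARGS = [
    "--data_dir", "data/mm_T100s", "--m_max", "60", "--cuda",
    "--batch_size", "256", "--amp", "--amp_dtype", "bf16",
    "--num_workers", "2", "--prefetch_factor", "4", "--persistent_workers",
    "--eval_every", "2", "--max_steps", "30000",
]


def build_command(script, extra_args=()):
    """Return the argv list for one training run (hypothetical helper)."""
    return ["python3", "-u", script, *COMMON_ARGS, *extra_args]


if __name__ == "__main__":
    # Print one shell-ready command per architecture without running it.
    for script in TRAIN_SCRIPTS:
        print(shlex.join(build_command(script)))
```

Printing the commands rather than executing them keeps the helper safe to run on a login node; pass each list to `subprocess.run` once the flags are confirmed.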

### 2. SLURM Job Scripts

For cluster environments with SLURM, use the provided shell scripts:

```bash
sbatch rnn.sh
sbatch transformer.sh
sbatch rwkv.sh
sbatch delta.sh
sbatch mamba.sh
```

You can edit these scripts to adjust hyperparameters, data paths, or environment variables as needed.
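If you prefer to submit all five jobs in one step, a small Python wrapper is one option. This is a minimal sketch under the assumption that the job scripts sit in the current directory; `submit_all` is a hypothetical name, and the function falls back to printing the commands when `sbatch` is not on the `PATH`.

```python
import shutil
import subprocess

# Job script names as listed in this README.
JOB_SCRIPTS = ["rnn.sh", "transformer.sh", "rwkv.sh", "delta.sh", "mamba.sh"]


def submit_all(scripts=JOB_SCRIPTS, dry_run=False):
    """Submit each job script with sbatch and return the commands issued.

    With dry_run=True, or when sbatch is unavailable, the commands are
    printed instead of executed.
    """
    have_sbatch = shutil.which("sbatch") is not None
    commands = []
    for script in scripts:
        cmd = ["sbatch", script]
        commands.append(cmd)
        if dry_run or not have_sbatch:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if a submission fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    submit_all(dry_run=True)
```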

### 3. Data

Datasets are located in the `data/` directory. Each subfolder contains `train`, `val`, and `test` splits. Update the `--data_dir` argument or the corresponding variable in the shell scripts to point to the desired dataset.
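Before starting a long run, it can help to confirm that a dataset folder has all three splits. The sketch below is a hypothetical check: this README does not state whether splits are files or directories, or which extension they use, so it assumes each split appears as an entry named `train`, `val`, or `test`, optionally followed by an extension.

```python
from pathlib import Path

EXPECTED_SPLITS = ("train", "val", "test")


def missing_splits(data_dir):
    """Return the expected split names that have no entry in data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return list(EXPECTED_SPLITS)
    # Take the part of each name before the first dot, so "train",
    # "train.pt", and "train.jsonl.gz" all count as the "train" split.
    found = {p.name.split(".")[0] for p in root.iterdir()}
    return [s for s in EXPECTED_SPLITS if s not in found]


if __name__ == "__main__":
    # Dataset path taken from the training example in this README.
    print(missing_splits("data/mm_T100s"))
```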

### 4. Requirements

- Python 3.8+
- PyTorch
- numpy
- tqdm

Install dependencies (if not using a cluster-provided environment):

```bash
pip install torch numpy tqdm
```
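To confirm that the environment meets the requirements listed above, a quick check along these lines may be useful. It is a sketch, not a repository script: `missing_packages` is a hypothetical name, and the check only tests whether each package can be found, not which version is installed.

```python
import importlib.util
import sys

# Package names from the requirements list in this README.
REQUIRED = ("torch", "numpy", "tqdm")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Python 3.8+ is required, found", sys.version.split()[0])
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```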

### 5. Results & Logs

Model checkpoints and logs are written to the current working directory unless a script specifies a different location.

### 6. Constructing a Dataset

To generate a dataset, use the `gen.py` script. Example command:

```bash
python3 gen.py --out_dir data/mm_stepwise_m29_qk0 --m 29 --qk 0 --T_train 50 --T_gap 100 --seed 0
```

- `--out_dir`: Output directory for the generated dataset
- `--m`: Modulus (preferably a prime, e.g., 29)
- `--qk`: Query index (e.g., 0)
- `--T_train`: Maximum sequence length for training
- `--T_gap`: Gap for test sequence lengths
- `--seed`: Random seed for reproducibility

Adjust these arguments as needed for your experiments. The generated data will appear in the specified directory, ready for use with the training scripts.
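When several datasets are needed, for example one per modulus, the `gen.py` commands can be generated in a loop. The sketch below uses only the flags documented above. `gen_command` is a hypothetical helper, the output directory name simply mirrors the pattern of the example command, and the moduli in the loop are illustrative values.

```python
import shlex


def gen_command(m, qk=0, t_train=50, t_gap=100, seed=0):
    """Build the gen.py argv for one dataset (hypothetical helper).

    The out_dir naming mirrors the README example and is an assumption,
    not something gen.py requires.
    """
    out_dir = f"data/mm_stepwise_m{m}_qk{qk}"
    return [
        "python3", "gen.py",
        "--out_dir", out_dir,
        "--m", str(m),
        "--qk", str(qk),
        "--T_train", str(t_train),
        "--T_gap", str(t_gap),
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Illustrative prime moduli; choose values that suit your experiments.
    for m in (7, 13, 29):
        print(shlex.join(gen_command(m)))
```

With the default arguments, `gen_command(29)` reproduces the example command shown above.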

---

For more details, see the header comments or docstrings at the top of each script.
