# When Data Is Scarce: Scaling Sparse Language Models with Repeated Training


This repository contains the official code for the submission "When Data Is Scarce: Scaling Sparse Language Models with Repeated Training".

We investigate how **Dynamic Sparse Training (DST)** interacts with **data repetition** in data-constrained pre-training regimes, and introduce sparsity-aware scaling laws to characterize this interaction.

## Install

```bash
pip install -r requirements.txt
```

## Training

### For DST with multiple epochs
```bash
bash scripts/train_llm_dst.sh --epoch 16
```

### For DST without repeated data
```bash
bash scripts/train_llm_dst.sh --epoch 1
```

### For dense training with multiple epochs
```bash
bash scripts/train_llm_dense.sh --epoch 16
```

### For dense training without repeated data
```bash
bash scripts/train_llm_dense.sh --epoch 1
```

The `--epoch` argument sets how many passes the model makes over the dataset. `--epoch 1` trains without repetition, and larger values increase the degree of data repetition.
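
To compare several degrees of repetition, the commands above can be wrapped in a small loop. This is a minimal sketch, not part of the repository: the `run_sweep` helper, the `DRY_RUN` switch, and the intermediate epoch values (2, 4, 8) are assumptions made for illustration. It assumes only that both scripts accept `--epoch` as shown above.

```bash
#!/usr/bin/env bash
# Hypothetical sweep over repetition levels; the epoch values are illustrative only.
EPOCHS="1 2 4 8 16"

# By default the commands are only printed. Set DRY_RUN=0 to launch the runs.
DRY_RUN="${DRY_RUN:-1}"

# run_sweep is a helper defined here, not a script shipped with the repository.
run_sweep() {
  local script="$1"
  local e
  for e in $EPOCHS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "bash $script --epoch $e"
    else
      bash "$script" --epoch "$e"
    fi
  done
}

run_sweep scripts/train_llm_dst.sh    # DST runs
run_sweep scripts/train_llm_dense.sh  # dense baselines
```

Running the sketch as-is lists the ten commands it would execute. Running it with `DRY_RUN=0` starts the training runs one after another.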
