# Memory-Efficient LLMs Training with Dynamic Sparsity: From Stability to Practical Scaling


This is the official implementation of the submission "Memory-Efficient LLMs Training with Dynamic Sparsity: From Stability to Practical Scaling".

## Requirements
- torch==2.1.0+cu118
- transformers==4.38.0
- huggingface-hub==0.27.0
- lm_eval==0.4.2
- datasets==2.19.1
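
Because the versions above are pinned exactly, it can help to confirm the active environment matches them before launching a run. The following is a minimal sketch and is not part of this repository: the helper name `find_mismatches` and the `PINNED` table are hypothetical, and the exact string comparison (including the `+cu118` local version tag on torch) is an assumption about how the packages were installed.

```python
from importlib import metadata

# Pins copied from the Requirements list above.
PINNED = {
    "torch": "2.1.0+cu118",
    "transformers": "4.38.0",
    "huggingface-hub": "0.27.0",
    "lm_eval": "0.4.2",
    "datasets": "2.19.1",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not met.

    `found` is None when the package is not installed.
    (Hypothetical helper, for illustration only.)
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Assumption: an exact string match is required, local tag included.
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not problems:
        print("All pinned requirements are satisfied.")
```

A torch build installed from a different wheel index may report a different local version tag (or none), in which case the check flags it even though the base version is 2.1.0.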

## Quick start



### Pre-training and Evaluation

Run the following script:
```bash
bash scripts/train_script.sh
```
