# ZipLM: Inference-Aware Structured Pruning of Language Models

## Steps to reproduce our results from Table 2
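1. Create the conda environment: `conda create --name ziplm python=3.9 -y`
2. Activate it: `conda activate ziplm`
3. Install PyTorch: `conda install pytorch==1.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia -y`
4. Install the remaining dependencies: `pip install transformers datasets accelerate`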
5. Run the ZipLM script:
```shell
bash run_ziplm.sh
```
6. To reproduce our GLUE one-shot results, add the following two lines to the `run_glue.py` example from the HuggingFace Transformers library:
```python
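# Import at the top of run_glue.py.
from ziplm import oneshot_prune
# Call inside main(), once `trainer` and `model` exist (indent to match the surrounding code).
oneshot_prune(trainer, model, model_args.ziplm_target, model_args.loader_batchsize, model_args.loader_nsamples, model_args.timings_file)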
```
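
The call in step 6 reads four attributes from `model_args` that the stock `run_glue.py` does not define. One possible way to provide them is to add matching fields to the `ModelArguments` dataclass in `run_glue.py`, which `HfArgumentParser` then exposes as command-line flags. The sketch below shows such fields on a standalone dataclass. The class name `ZipLMArguments`, the field types, the defaults, and the help strings are all assumptions made for illustration; only the four field names come from the call above.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ZipLMArguments:
    """Hypothetical container for the extra arguments read in step 6.

    Only the field names are taken from the `oneshot_prune` call;
    types, defaults, and help strings are assumptions.
    """

    ziplm_target: Optional[str] = field(
        default=None,
        metadata={"help": "Target passed to oneshot_prune (type assumed)."},
    )
    loader_batchsize: int = field(
        default=32,
        metadata={"help": "Batch size of the data loader (default assumed)."},
    )
    loader_nsamples: int = field(
        default=1024,
        metadata={"help": "Number of samples to load (default assumed)."},
    )
    timings_file: Optional[str] = field(
        default=None,
        metadata={"help": "Path to the timings file (type assumed)."},
    )


# Usage: build the arguments by hand and list the available field names.
args = ZipLMArguments(timings_file="timings.txt")
print([f.name for f in fields(args)])
```

If the fields are copied into `ModelArguments` instead, each one becomes available as `model_args.<name>` after parsing, which matches the attribute access used in step 6.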