# Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution


This is the official implementation of the submission titled Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution.

## Requirements
- torch==2.1.0+cu118
- transformers==4.38.0
- huggingface-hub==0.27.0
- lm_eval==0.4.2

## Quick start


### Sparsification

Before fine-tuning, SEFT sparsifies the language model with [Wanda](https://arxiv.org/abs/2306.11695), a simple but effective pruning approach. 
The sparsified model then serves as the frozen base model for sparse fine-tuning.

Below is an example command that prunes [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) 
with Wanda to 70% unstructured sparsity (`--sparsity_ratio 0.7`).
```bash
python wanda/main.py \
    --model meta-llama/Meta-Llama-3-8B \
    --prune_method wanda \
    --sparsity_ratio 0.7 \
    --sparsity_type unstructured \
    --save wanda_out \
    --save_model <path to sparse base model>
```
- `--model`: The model identifier on the Hugging Face model hub, or a local path to the model.
- `--sparsity_ratio`: Specifies the fraction of weights to prune, between 0 and 1 (for example, `0.7` prunes 70% of the weights).
- `--save_model`: Specifies the directory where the sparsified language model will be stored.
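
To produce base models at several sparsity levels, the command above can be wrapped in a loop. The following is only a dry-run sketch: it prints one command per ratio instead of running it, and the ratio list, the `SPARSE_ROOT` directory, and the per-ratio folder names are hypothetical choices, not settings prescribed by this repository.

```bash
# Dry-run sketch: print one Wanda command per sparsity ratio.
# MODEL_ID is taken from the example above; SPARSE_ROOT, the folder names,
# and the ratio list are hypothetical and only serve as an illustration.
MODEL_ID="meta-llama/Meta-Llama-3-8B"
SPARSE_ROOT="./sparse_models"

wanda_cmd() {
    local ratio="$1"
    # `echo` only prints the command; remove it to actually launch pruning.
    echo python wanda/main.py \
        --model "$MODEL_ID" \
        --prune_method wanda \
        --sparsity_ratio "$ratio" \
        --sparsity_type unstructured \
        --save "wanda_out/$ratio" \
        --save_model "$SPARSE_ROOT/llama3-8b-wanda-$ratio"
}

for ratio in 0.5 0.6 0.7; do
    wanda_cmd "$ratio"
done
```

Giving each ratio its own `--save` and `--save_model` directory keeps the outputs of different runs apart.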


### Fine-Tuning
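
The fine-tuning command below reads several shell variables that are not defined elsewhere in this README. A minimal setup sketch follows; every path and value in it is a hypothetical placeholder rather than a recommended setting. Because the command redirects its log into `$OUTPUT_DIR`, that directory has to exist before the command starts.

```bash
# All values are illustrative placeholders; replace them with your own.
sparse_DIR="./sparse_models/llama3-8b-wanda"   # directory passed to --save_model during sparsification
sparsity_ratio=0.7                             # sparsity ratio of the pruned base model
NUM_TUNABLE_WEIGHTS=1000000                    # arbitrary placeholder, not a tuned value
OUTPUT_DIR="./seft_out"                        # where the fine-tuned model and log.txt are written

# The shell opens log.txt before Python starts, so the directory must already exist.
mkdir -p "$OUTPUT_DIR"
```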

```bash
python -u lora_ft/finetune_split.py \
    --model_name_or_path $sparse_DIR \
    --data_used 'only_sft' \
    --num_train_epochs 1 \
    --block_size 512 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 8 \
    --do_train \
    --seed 42 \
    --use_qpeft False \
    --peft_type sft \
    --sparsity_ratio $sparsity_ratio \
    --sft_num_deltas $NUM_TUNABLE_WEIGHTS \
    --sft_reselection_steps 60 \
    --sft_selection_accumulation_steps 5 \
    --max_train_samples 30000 \
    --max_eval_samples 128 \
    --learning_rate 1e-3 \
    --initial_reselection_rate 0.2 \
    --weight_decay 0 \
    --dataset_path 'c4' \
    --pruned True \
    --output_dir $OUTPUT_DIR > $OUTPUT_DIR/log.txt 2>&1
```
- `--model_name_or_path`: The Hugging Face model hub identifier or local path of the sparse base model to fine-tune.
- `--sparsity_ratio`: Specifies the sparsity ratio of the base model, i.e. the fraction of its weights that were pruned.
- `--sft_num_deltas`: Specifies the number of weights that are tuned during fine-tuning.
- `--output_dir`: Specifies the directory where the fine-tuned model will be stored.

### Evaluation

```bash
python ./lora_ft/evaluate_tasks.py \
    --model $sparse_DIR \
    --few_shots $shots \
    --task $tasks \
    --lora_weights $OUTPUT_DIR >> $OUTPUT_DIR/log_eval.txt 2>&1
```
- `--model`: The Hugging Face model hub identifier or local path of the model before fine-tuning.
- `--few_shots`: Specifies the number of shots for evaluation.
- `--task`: Specifies the tasks for evaluation.
- `--lora_weights`: Specifies the directory containing the model after fine-tuning.
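
The evaluation command above also relies on `$shots` and `$tasks`. As a sketch, they could be set as follows. The task name is just one example of an lm_eval task, and since this README does not say how `evaluate_tasks.py` parses multiple tasks, only a single task is shown. The `./seft_out` default is a hypothetical placeholder.

```bash
# Illustrative values only; adjust to the benchmarks you care about.
shots=0               # number of few-shot examples (0 = zero-shot)
tasks="hellaswag"     # one lm_eval task name, chosen as an example

# `>>` appends to a file inside $OUTPUT_DIR, so warn early if the directory is missing.
OUTPUT_DIR="${OUTPUT_DIR:-./seft_out}"   # hypothetical default
if [ ! -d "$OUTPUT_DIR" ]; then
    echo "warning: $OUTPUT_DIR does not exist; run fine-tuning first" >&2
fi
```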



## Acknowledgements
We appreciate your valuable time in reviewing our submission.
