

# Kron-LoRA: hybrid Kronecker–LoRA adapters for scalable, sustainable fine-tuning

This repository is the official implementation of Kron-LoRA: hybrid Kronecker–LoRA adapters for scalable, sustainable fine-tuning.


## Requirements

To install requirements:

```setup
pip install -r requirements.txt
```


## Usage

All training and evaluation is driven by `main_hellaswag_mistral.py`. Both Kron-LoRA and LoRA adapters are implemented; select one with `--adapter_type kron` or `--adapter_type lora`. As mentioned in the paper, `gradient_accumulation_steps` can change the accuracy (for both adapters), so it is not used here.

```
# Notes on the flags below:
#   --adapter_type: kron or lora.
#   --lora_r: only used with --adapter_type lora.
#   --eval_strategy: if this name raises an error, rename it to evaluation_strategy,
#     both here and in main_hellaswag_mistral.py.
#   --save_strategy: checkpoints are not saved by default; use epoch or steps to save them.
#   --load_best_model_at_end: keep False while no checkpoints are saved; can be True once they are.
#   --epoch_num: number of training epochs.
#   --HF_TOKEN: a Hugging Face token, needed to access Mistral v0.1.
python main_hellaswag_mistral.py \
  --adapter_type kron \
  --model_name_or_path mistralai/Mistral-7B-v0.1 \
  --lora_r 8 \
  --train_batch_size 16 \
  --eval_batch_size 8 \
  --lr 3e-4 \
  --dropout 0.1 \
  --alpha 32.0 \
  --eval_strategy epoch \
  --save_strategy no \
  --load_best_model_at_end False \
  --epoch_num 16 \
  --WANDB_DISABLED true \
  --HF_TOKEN YOUR_HUGGINGFACE_TOKEN \
  --output_dir ./checkpoints/mistral_hellaswag
```
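
The `eval_strategy` / `evaluation_strategy` rename mentioned above can also be handled in code rather than by editing the script by hand. The following is a minimal sketch, not part of this repository: `pick_supported_kwarg`, `new_style_args`, and `old_style_args` are hypothetical names, and the two stand-in functions only imitate an arguments class that accepts one spelling or the other. It assumes the script forwards this flag to some arguments constructor, which this README does not state.

```python
import inspect


def pick_supported_kwarg(func, candidates):
    """Return the first name in `candidates` that `func` accepts as a parameter."""
    params = inspect.signature(func).parameters
    for name in candidates:
        if name in params:
            return name
    raise TypeError(f"{func.__name__} accepts none of {candidates}")


# Hypothetical stand-ins for two library versions (not real library classes).
def new_style_args(output_dir, eval_strategy="no"):
    return {"output_dir": output_dir, "eval_strategy": eval_strategy}


def old_style_args(output_dir, evaluation_strategy="no"):
    return {"output_dir": output_dir, "evaluation_strategy": evaluation_strategy}


NAMES = ("eval_strategy", "evaluation_strategy")

for make_args in (new_style_args, old_style_args):
    # Look up which spelling this constructor understands, then pass it by name.
    key = pick_supported_kwarg(make_args, NAMES)
    args = make_args("./checkpoints/mistral_hellaswag", **{key: "epoch"})
    print(make_args.__name__, "->", key, "=", args[key])
```

Since `inspect.signature` also works on classes, the same lookup could in principle be pointed at whichever arguments class the script actually constructs.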


## Results

Mistral-7B test accuracy (%) at the epoch of best validation performance:

| Adapter       | #Params    | Avg. (%)  | PIQA      | HellaSwag | WinoGrande | ARC‑Easy  | ARC‑Challenge |
| ------------- | ---------- | --------- | --------- | --------- | ---------- | --------- | ------------- |
| LoRA-4        | 10.63 M    | 74.28     | 85.26     | 84.23     | 80.58      | 73.86     | 47.49         |
| LoRA-8        | 21.26 M    | 77.42     | 85.96     | 86.15     | 81.45      | 76.67     | 56.86         |
| LoRA-16       | 42.52 M    | 78.24     | 85.64     | 88.00     | 81.45      | 78.60     | 57.53         |
| **Kron-LoRA** | **5.71 M** | **77.01** | **85.53** | **86.30** | **81.22**  | **76.84** | **55.18**     |
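
To make the parameter/accuracy trade-off in the table easier to read, the short script below compares each LoRA row against Kron-LoRA. It is only an illustrative sketch: the numbers are copied from the table, and `results` and `compare_to_kron` are names introduced here, not part of the repository.

```python
# Values copied from the results table: (parameters in millions, average accuracy in %).
results = {
    "LoRA-4": (10.63, 74.28),
    "LoRA-8": (21.26, 77.42),
    "LoRA-16": (42.52, 78.24),
    "Kron-LoRA": (5.71, 77.01),
}

kron_params, kron_avg = results["Kron-LoRA"]


def compare_to_kron(name):
    """Return (parameter ratio, average-accuracy gap in points) of `name` versus Kron-LoRA."""
    params, avg = results[name]
    return params / kron_params, avg - kron_avg


for name in ("LoRA-4", "LoRA-8", "LoRA-16"):
    ratio, gap = compare_to_kron(name)
    print(f"{name}: {ratio:.2f}x the parameters of Kron-LoRA, {gap:+.2f} points average accuracy")
```

On these numbers, LoRA-8 uses about 3.7 times as many parameters as Kron-LoRA for a 0.41-point higher average, while LoRA-4 uses about 1.9 times as many parameters and scores 2.73 points lower on average.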


## License

This code is released under the [MIT License](LICENSE). 
