## This dir contains the source code of QA-LoRA.

- `qa-lora.py` is the main script for running the code.
- `perft_utils.py` contains the new LoRA algorithm.
- `merge.py` is the script that merges the quantized model with the saved LoRA parameters.
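
As a rough illustration of what such a merge can look like, the sketch below assumes a group-wise quantization scheme in which each weight is reconstructed as `scale * (q - zero)`, and a LoRA update that is constant within each quantization group. Under those assumptions the update can be folded into the zero points without touching the integer codes. The function names, the single-column layout, and the numbers are hypothetical and are not taken from `merge.py`.

```python
def dequantize(q, scales, zeros, group_size):
    """Reconstruct one weight column: w[i] = scales[g] * (q[i] - zeros[g]), g = i // group_size."""
    return [scales[i // group_size] * (q[i] - zeros[i // group_size])
            for i in range(len(q))]


def fold_update_into_zeros(scales, zeros, delta):
    """Return new zero points so that every weight in group g shifts by delta[g].

    scales[g] * (q - (zeros[g] - delta[g] / scales[g]))
        == scales[g] * (q - zeros[g]) + delta[g]
    """
    return [z - d / s for z, d, s in zip(zeros, delta, scales)]


# Toy example (made-up numbers): 4-bit codes, two groups of size 2.
q = [3, 12, 7, 0]
scales = [0.5, 0.25]
zeros = [8.0, 8.0]       # assumed to be stored as floats
delta = [0.1, -0.2]      # hypothetical per-group LoRA update

before = dequantize(q, scales, zeros, group_size=2)
merged_zeros = fold_update_into_zeros(scales, zeros, delta)
after = dequantize(q, scales, merged_zeros, group_size=2)
```

Here `after` differs from `before` by exactly `delta[g]` for every weight in group `g`, while the integer codes `q` stay unchanged.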


## How to run?

    conda create -n gptqlora python=3.8
    conda activate gptqlora
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
    git clone -b peft_integration https://github.com/PanQiWei/AutoGPTQ.git && cd AutoGPTQ
    pip install ".[triton]"
    cd ..
    git clone https://github.com/timdettmers/bitsandbytes.git
    cd bitsandbytes
    # CUDA_VERSION in {110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120}
    # make argument in {cuda110, cuda11x, cuda12x}
    # if you do not know which CUDA version you have, look at the output of: python -m bitsandbytes
    CUDA_VERSION=117 make cuda11x
    python setup.py install
    cd ..
    pip install git+https://github.com/huggingface/transformers.git
    pip install git+https://github.com/huggingface/peft.git
    pip install git+https://github.com/huggingface/accelerate.git
    pip install -r requirements.txt
    pip install protobuf==3.20.*
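
The `CUDA_VERSION` and `make` argument used for the bitsandbytes build depend on the local CUDA toolkit. The following is a minimal sketch of how the two values could be derived from `nvcc --version` output. The helper names are hypothetical, and the mapping from version to `make` argument (`110` to `cuda110`, other 11.x to `cuda11x`, 12.x to `cuda12x`) is an assumption read off the target names in the comments above, so check it against the bitsandbytes build instructions.

```python
import re


def cuda_version_from_nvcc(nvcc_output):
    """Extract e.g. '117' from the 'release 11.7' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return match.group(1) + match.group(2)


def make_target(cuda_version):
    """Map a CUDA_VERSION string to a make argument (assumed mapping, see text)."""
    if cuda_version == "110":
        return "cuda110"
    if cuda_version.startswith("11"):
        return "cuda11x"
    if cuda_version.startswith("12"):
        return "cuda12x"
    raise ValueError(f"unsupported CUDA version: {cuda_version}")


# Sample line as printed by `nvcc --version` for CUDA 11.7.
sample = "Cuda compilation tools, release 11.7, V11.7.99"
version = cuda_version_from_nvcc(sample)
print(f"CUDA_VERSION={version} make {make_target(version)}")
```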
Then replace the `perft_utils.py` in the installed `auto_gptq` package with the one in this directory.
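
One possible way to script this replacement is sketched below. The helper `replace_in_package` is hypothetical: it searches a package directory for files with the same name as the new file, keeps a `.bak` copy of each, and overwrites them. It makes no assumption about where the file lives inside `auto_gptq`.

```python
import shutil
from pathlib import Path


def replace_in_package(package_dir, new_file):
    """Overwrite every file in package_dir named like new_file, keeping a .bak copy."""
    new_file = Path(new_file)
    replaced = []
    for target in sorted(Path(package_dir).rglob(new_file.name)):
        # Back up the packaged file before overwriting it.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
        shutil.copy2(new_file, target)
        replaced.append(target)
    return replaced


# Hypothetical usage, requires auto_gptq to be installed:
# import auto_gptq
# print(replace_in_package(Path(auto_gptq.__file__).parent, "perft_utils.py"))
```

An empty result means no file with that name was found in the package, in which case the file name or the install location should be checked by hand.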

    python gptqlora.py --model_path <path>