# ALS-ActLR: Activation-Aware Low-Rank Compression

This repository provides the implementation of **ALS-ActLR** (Alternating Least Squares-based Activation-Aware Low-Rank compression), a post-training compression framework for large language models (LLMs).
It combines **Spectral-Informed Metric Transformation (SIMT)**, **Activation-Aware ALS factorization**, and **Uncertainty-Weighted Multi-Objective Distillation (UW-MOD)** to compress a model while preserving its accuracy.
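
To give a feel for what "activation-aware" ALS factorization means, here is a minimal NumPy sketch. It is not the code in `ALS_ActLR.py`: the function names, the SVD initialization, and the plain least-squares updates are assumptions made for illustration, and SIMT is not modeled at all. The idea shown is to factor a weight matrix `W` into `A @ B` so that the layer *outputs* on calibration activations `X` are preserved, rather than the weights themselves.

```python
import numpy as np


def output_error(W, X, A, B):
    """Frobenius error of the layer outputs on the calibration activations X."""
    return np.linalg.norm(X @ W - X @ A @ B)


def activation_aware_als(W, X, rank, iters=3):
    """Approximate W (d_in x d_out) by A @ B, reducing ||X @ W - X @ A @ B||_F.

    Hypothetical helper for illustration; X is an (n x d_in) batch of activations.
    """
    # Start from the truncated SVD of W, which ignores the activations.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # d_in x rank
    B = Vt[:rank]                # rank x d_out
    XW = X @ W                   # outputs the factorization should reproduce

    for _ in range(iters):
        # Fix A and solve the least-squares problem (X A) B = X W for B.
        B = np.linalg.lstsq(X @ A, XW, rcond=None)[0]
        # Fix B and solve for A. The normal equations
        # X^T X (W - A B) B^T = 0 are satisfied by A = W B^T (B B^T)^+.
        A = W @ B.T @ np.linalg.pinv(B @ B.T)
    return A, B
```

Because each half-step solves its subproblem exactly, the output error never increases from the SVD starting point. Calling `activation_aware_als(W, X, rank=4)` and comparing `output_error` against the truncated-SVD factors is a quick way to see the effect on a toy matrix.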

---

## Installation

Clone this repository and install dependencies from `requirements.txt`:

```bash
pip install -r requirements.txt
```

We recommend using **Python 3.10+** with **PyTorch >= 2.1** and CUDA support.

---

## Quick Start

We provide a shell script `test.sh` that demonstrates how to compress a pretrained **LLaMA-7B** model using ALS-ActLR.  

Run the following command:

```bash
sh test.sh
```

The script runs three steps in sequence:

1. **Step 1 (SIMT + ALS):** Profile activations and compress the model using Activation-Aware ALS.  
2. **Step 2 (UW-MOD):** Apply uncertainty-weighted knowledge distillation to refine the compressed model.  
3. **Step 3 (Evaluation):** Evaluate the compressed/distilled model on perplexity and commonsense reasoning benchmarks.
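
As a rough illustration of the uncertainty weighting in Step 2, the sketch below combines several distillation losses using a learnable log-variance per objective. This is a common simplified form of homoscedastic uncertainty weighting and is only an assumption about how UW-MOD balances its objectives; the function name and the example loss terms are hypothetical, and the repository's formulation may differ.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine per-objective losses L_i with log-variances s_i.

    total = sum_i( exp(-s_i) * L_i + s_i )

    A large s_i down-weights a noisy objective, while the additive s_i term
    keeps the weights from collapsing to zero. (Assumed form, for illustration.)
    """
    if len(losses) != len(log_vars):
        raise ValueError("need one log-variance per loss term")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


# Hypothetical objectives: a logit-matching loss and a hidden-state loss.
total = uncertainty_weighted_loss([2.0, 3.0], [0.0, 0.0])
```

With all log-variances at zero the result is the plain sum of the losses. In a PyTorch training loop the `log_vars` would typically be trainable parameters optimized together with the student.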

---

## Example Workflow 

To run each step manually, use the following commands:

```bash
# Step 1: SIMT + ALS compression (writes the compressed checkpoint to --save_path)
python -u ALS_ActLR.py --model jeffwan/llama-7b-hf \
    --step 1 --ratio 0.6 --save_path . --tau 0.003 --rho 0.003 --iter 3

# Step 2: UW-MOD distillation (--student_ckpt is the checkpoint produced by Step 1)
python -u ALS_ActLR.py --model jeffwan/llama-7b-hf \
    --student_ckpt jeffwan_llama_7b_hf_ratio0.6_tau0.003_rho0.003_iter3.pt \
    --step 2 --save_path ./runs --epochs 50 --kd_loader_batch_size 3

# Step 3: Evaluation (point --model_path at the distilled checkpoint from Step 2)
python -u ALS_ActLR.py --model_path student_distilled.pt --step 3
```
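
When sweeping hyperparameters it can help to generate these commands from a script. The sketch below rebuilds the three command lines shown above in Python. The helper names are hypothetical, and the checkpoint naming pattern is inferred from the single example filename, so verify it against what Step 1 actually writes before relying on it.

```python
import shlex

SCRIPT = "ALS_ActLR.py"


def checkpoint_name(model, ratio, tau, rho, iters):
    # Assumed pattern, inferred from the one example filename in this README.
    stem = model.replace("/", "_").replace("-", "_")
    return f"{stem}_ratio{ratio}_tau{tau}_rho{rho}_iter{iters}.pt"


def build_commands(model, ratio=0.6, tau=0.003, rho=0.003, iters=3,
                   epochs=50, kd_batch_size=3):
    """Return the argument lists for Steps 1-3, using only flags shown above."""
    base = ["python", "-u", SCRIPT]
    step1 = base + ["--model", model, "--step", "1", "--ratio", str(ratio),
                    "--save_path", ".", "--tau", str(tau), "--rho", str(rho),
                    "--iter", str(iters)]
    step2 = base + ["--model", model,
                    "--student_ckpt",
                    checkpoint_name(model, ratio, tau, rho, iters),
                    "--step", "2", "--save_path", "./runs",
                    "--epochs", str(epochs),
                    "--kd_loader_batch_size", str(kd_batch_size)]
    step3 = base + ["--model_path", "student_distilled.pt", "--step", "3"]
    return [step1, step2, step3]


if __name__ == "__main__":
    for cmd in build_commands("jeffwan/llama-7b-hf"):
        # Printing only; pass cmd to subprocess.run(cmd, check=True) to execute.
        print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell-quoting problems if a model path ever contains spaces.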

---

## Outputs

- **Compressed model checkpoints** are saved to the directory given by `--save_path` in Step 1.
- **Distilled student checkpoints** are saved to the `--save_path` given in Step 2 (`./runs` in the example above).
- Evaluation will report:
  - **Perplexity (PPL)** on language modeling datasets (WikiText2, PTB, C4).  
  - **Accuracy (%)** on reasoning benchmarks (ARC-Easy, WinoGrande, HellaSwag, PIQA).  
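
For readers unfamiliar with the perplexity metric, the sketch below shows the standard definition: the exponential of the mean negative log-likelihood per token. The function name is hypothetical, and the evaluation code in this repository may differ in details such as how long texts are split into windows.

```python
import math


def perplexity(token_log_probs):
    """exp(mean negative log-likelihood), with natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)


# A model that assigns probability 1/4 to every token has perplexity 4.
ppl = perplexity([math.log(0.25)] * 8)
```

Lower is better: a perplexity of 4 means the model is, on average, as uncertain as a uniform choice among four tokens.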
