
# FD-LoRA: Federated Distillation for LoRA Fine-tuning

This repository provides a simulation framework for FD-LoRA, a federated learning method that fine-tunes LoRA modules by distillation over a public prompt dataset. It supports non-IID client data, heterogeneous LoRA ranks across clients, and privacy-preserving construction of the public dataset.

---

## Project Structure

```
.
run.py              # Entry point for federated training
config.py           # All configurable parameters (rank, alpha, etc.)
data_loader.py      # Dataset download, preprocessing, and partition
prepare_public.py   # Construction of public prompt dataset (template, obfuscation)
lora_model.py       # LoRA-enhanced RoBERTa model
distill_util.py     # Distillation utility functions (KL loss, softmax)
client.py           # Client-side training logic
server.py           # Server-side aggregation
logger.py           # Logging utilities
metrics.py          # Accuracy, F1, confusion matrix, etc.
utils.py            # General utilities
plot_utils.py       # Visualization of accuracy, rank drop, etc.
requirements.txt
README.md
```

---

## Getting Started

### Installation
```bash
pip install -r requirements.txt
```

### Run FD-LoRA Federated Training
```bash
python run.py --dataset mnli --alpha 0.5 --num_rounds 10 --rank 8 --distill_temp 2 --non_iid_level severe
```

See `config.py` for the full list of arguments.
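
The `--distill_temp` flag sets the softmax temperature used during distillation. As a rough illustration of what a temperature-scaled KL loss looks like, here is a minimal, dependency-free sketch. The function names `log_softmax` and `distill_kl` are hypothetical and are not taken from `distill_util.py`; the `T^2` scaling is a common convention in knowledge distillation and is assumed here, not confirmed by this repository.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities of a temperature-scaled softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distill_kl(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Multiplying by temperature**2 is a common convention (assumed here)
    that keeps gradient magnitudes comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    print(distill_kl(student, teacher, temperature=2.0))
```

A higher temperature flattens both distributions, so the student is also trained on the teacher's relative preferences among the non-top classes.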

---

## Supported Datasets

- MNLI (matched + mismatched)
- SST-2
- QQP
- QNLI

All datasets are downloaded and partitioned across clients in `data_loader.py`.
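
This README does not describe the partitioning scheme used for non-IID scenarios. One common approach in federated learning simulations is Dirichlet label skew, sketched below purely as an illustration. The function `dirichlet_partition` and its `concentration` parameter are hypothetical and may not match what `data_loader.py` does.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, concentration, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    Smaller `concentration` values give more skewed (more non-IID) splits.
    This is an illustrative scheme, not necessarily the repository's.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # A Dirichlet sample is a vector of Gamma draws, normalized.
        weights = [rng.gammavariate(concentration, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        start, cumulative = 0, 0.0
        for c, w in enumerate(weights):
            cumulative += w
            # The last client takes the remainder so no index is dropped.
            if c == num_clients - 1:
                end = len(idxs)
            else:
                end = round(cumulative / total * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 3 for i in range(300)]  # three balanced classes
    parts = dirichlet_partition(toy_labels, num_clients=5, concentration=0.5)
    print([len(p) for p in parts])
```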

---

## Privacy-Preserving Public Dataset Construction

Implemented in `prepare_public.py` using:
- Template-based prompt generation
- Prompt obfuscation and anonymization
- Differentially private logit sharing (optional)
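
To give a sense of what the optional differentially private logit sharing might involve, the sketch below applies a generic Gaussian mechanism: clip each logit vector to a bounded L2 norm, then add Gaussian noise proportional to that bound. This is an assumption about the general technique, not a description of `prepare_public.py`. The name `privatize_logits` and the parameters `clip_norm` and `noise_multiplier` are hypothetical, and the sketch does not compute a privacy budget.

```python
import math
import random


def privatize_logits(logits, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip a logit vector to `clip_norm` (L2) and add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm.
    Calibrating it to a target (epsilon, delta) is out of scope here.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(z * z for z in logits))
    # Scale down only if the vector exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [z * scale for z in logits]
    sigma = noise_multiplier * clip_norm
    return [z + rng.gauss(0.0, sigma) for z in clipped]


if __name__ == "__main__":
    noisy = privatize_logits([3.0, 4.0, 0.0], clip_norm=1.0,
                             noise_multiplier=0.5, rng=random.Random(0))
    print(noisy)
```

Clipping bounds how much any single logit vector can influence what is shared, which is what lets the noise scale be tied to a fixed bound.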

---

## Metrics and Visualization

- Local and global accuracy
- Distillation loss tracking
- Communication cost analysis
- Per-rank accuracy drop
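
For the communication cost analysis, a back-of-the-envelope comparison can be useful. The sketch below contrasts the size of LoRA adapter weights with the size of logits on a public prompt set, assuming clients exchange one or the other each round. All function names are hypothetical, and the example settings (hidden size 768, 12 layers, two adapted matrices per layer, 1000 public prompts, 3 classes) are illustrative assumptions, not values read from `config.py`.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def adapter_payload_bytes(hidden, num_layers, matrices_per_layer, rank,
                          bytes_per_value=4):
    """Bytes needed to send all LoRA adapters of a square-projection model."""
    per_matrix = lora_param_count(hidden, hidden, rank)
    return num_layers * matrices_per_layer * per_matrix * bytes_per_value


def logit_payload_bytes(num_prompts, num_classes, bytes_per_value=4):
    """Bytes needed to send one logit vector per public prompt."""
    return num_prompts * num_classes * bytes_per_value


if __name__ == "__main__":
    # Assumed settings for illustration only.
    adapters = adapter_payload_bytes(hidden=768, num_layers=12,
                                     matrices_per_layer=2, rank=8)
    logits = logit_payload_bytes(num_prompts=1000, num_classes=3)
    print(f"adapters: {adapters} bytes, logits: {logits} bytes")
```

Under these assumed settings the adapter payload grows linearly with rank, while the logit payload depends only on the public set size and the number of classes.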

---
