# CLAD: Continual Learning Adversarial Detection & Defense

A research‑grade implementation of the **CLAD** framework — *Continual Learning for Adversarial text Detection and Defense*. The code focuses on the **detector / classifier / defense loop** and **replay‑based continual learning** components.

> **Reference paper:** *CLAD: Continual Learning for Robust Adversarial Text Detection and Repair in Resource‑Constrained Scenarios*

---

## ✨ Key Features

| Module                 | Purpose                                                                                                                            |
| ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| `load_text_dataset.py` | Unified loader (`Dataset`, `CLSDataset`) for natural and adversarial samples, with a `$LABEL$` splitter and on‑the‑fly tokenisation. |
| `model.py`             | Core architectures: `DetModel`, `CLSModel`, and `ADModel` (joint detector + classifier). All inherit from 🤗 Transformers backbones. |
| `autocuda.py`          | One‑liner `auto_cuda()` for automatic GPU selection with CPU fallback. Omitted from this release.                                  |
| `train_detector.py`    | Replay‑based continual training for the standalone detector, with a memory buffer and dual metrics (ACC, F).                       |
| `train_classifier.py`  | Vanilla fine‑tuning script for the task classifier.                                                                                |
| `run_defense.py`       | Full *attack → detect → repair → re‑classify* pipeline integrating TextAttack and an optional LLM paraphrase step (`PDLLM`).       |
| `MetricVisualizer`     | Lightweight tracker exporting `.txt` / `.json` summaries. Omitted from this release.                                               |
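
The `$LABEL$` splitter listed for `load_text_dataset.py` suggests that each sample is stored as text followed by its label, separated by the literal token `$LABEL$`. The sketch below shows one way such lines could be parsed. It assumes one sample per line and an integer label; `parse_line` and `load_samples` are hypothetical names and are not taken from this repository.

```python
from typing import Iterable, List, Tuple

SEP = "$LABEL$"  # separator token named in the table above


def parse_line(line: str) -> Tuple[str, int]:
    """Split 'text$LABEL$label' into (text, label). Assumes an integer label."""
    # rpartition splits on the last separator, so the text itself may contain "$".
    text, sep, label = line.rpartition(SEP)
    if not sep:
        raise ValueError(f"missing {SEP!r} separator: {line!r}")
    return text.strip(), int(label.strip())


def load_samples(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse every non-blank line into a (text, label) pair."""
    return [parse_line(ln) for ln in lines if ln.strip()]
```

Tokenisation would then be applied to the returned text when a batch is built, which is what "on‑the‑fly" usually implies.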

All scripts are **single‑GPU friendly** and were tested on an NVIDIA RTX 3090.
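
To illustrate the replay‑based training idea behind `train_detector.py`, here is a minimal sketch of a fixed‑size memory buffer whose stored samples are mixed into each new batch. The reservoir‑sampling policy, the `ReplayBuffer` class, the `mixed_batches` helper, and the `replay_k` parameter are all assumptions made for illustration; the buffer policy actually used in this repository may differ.

```python
import random
from typing import Generic, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


class ReplayBuffer(Generic[T]):
    """Fixed-size memory filled by reservoir sampling (an assumed policy)."""

    def __init__(self, capacity: int, seed: int = 42) -> None:
        self.capacity = capacity
        self.items: List[T] = []
        self.seen = 0  # total number of samples offered so far
        self.rng = random.Random(seed)

    def add(self, item: T) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Once full, each offered sample is kept with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k: int) -> List[T]:
        """Draw up to k stored samples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


def mixed_batches(
    new_task: Sequence[T],
    buffer: ReplayBuffer[T],
    batch_size: int = 32,
    replay_k: int = 8,
) -> Iterator[List[T]]:
    """Yield batches of new-task samples extended with replayed old samples."""
    for start in range(0, len(new_task), batch_size):
        batch = list(new_task[start:start + batch_size])
        replayed = buffer.sample(replay_k)  # drawn before the new samples are stored
        for sample in batch:
            buffer.add(sample)
        yield batch + replayed
```

A training loop would iterate over `mixed_batches(task_data, buffer)` for each new attack type in turn, so that earlier adversarial samples keep contributing to the loss while the memory footprint stays bounded by `capacity`.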

---

## 📂 Repository Layout (core)

```
clad-core/
├─ load_text_dataset.py
├─ model.py
├─ train_detector.py
├─ train.py
├─ train_classifier.py
├─ run_defense.py
├─ utils/
│  └─ {TestAttack}
├─ detectors{MS}/
│  └─ … # saved ckpts & records.json
├─ classifiers/
│  └─ … # saved ckpts & tokenizers
└─ datasets/
   └─ SST2/ AGNEWS/ AMAZON/ YAHOO/ …
```

---

## 📈 Reproducibility

* All random seeds default to `42` (set `--seed` to override).
* Hyper‑parameters match the paper: `lr=2e-5`, `batch_size=32`, `max_len=128`.
* The metrics dictionary (`defense_metrics_ms100_llm.json`) stores aggregated results for each *dataset‑attack* pair.
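
The seed and hyper‑parameter defaults listed above could be wired up roughly as follows. This is a sketch under assumptions: only the `--seed` flag is named in this README, so the other flag names (`--lr`, `--batch_size`, `--max_len`) and the `build_parser` / `seed_everything` helpers are hypothetical, with only their default values taken from the list above.

```python
import argparse
import random


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=42)  # flag named in this README
    # Flag names below are assumptions; only the default values come from the README.
    parser.add_argument("--lr", type=float, default=2e-5)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--max_len", type=int, default=128)
    return parser


def seed_everything(seed: int) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


if __name__ == "__main__":
    args = build_parser().parse_args()
    seed_everything(args.seed)
```

Seeding every random number generator at start‑up makes data shuffling and weight initialisation repeatable across runs, although some GPU kernels can still introduce small run‑to‑run differences.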

---

## 🪪 License

Released under the **MIT License** – see `LICENSE` for details.


## 🏁 Acknowledgements

This repo builds upon [TextAttack](https://github.com/QData/TextAttack) for attack generation and 🤗 Transformers for model back‑bones.
