# Token-Guard

Token-Guard is a **token-level hallucination control framework** for Large Language Models (LLMs). It dynamically detects and corrects hallucinated tokens during generation, improving factual and logical consistency without large-scale fine-tuning or heavy retrieval.

---

## Table of Contents

- [Features](#features)  
- [Installation](#installation)  
- [Datasets](#datasets)  
- [Quick Start](#quick-start)  
- [Evaluation](#evaluation)  

---

## Features

- **Token-level hallucination control** with internal self-checking.  
- **Iterative pruning and regeneration** to dynamically correct errors.  
- **Latent-space hallucination scoring** to quantitatively evaluate token reliability.  
- Compatible with multiple LLMs for knowledge-intensive tasks.  
- Provides per-token logging for debugging and analysis.  
- Lightweight, decoding-based approach that avoids resource-heavy fine-tuning.  
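
The check-prune-regenerate idea behind the features above can be illustrated with a toy decoding loop. This is only a minimal sketch under stated assumptions, not the code shipped in this repository: `guarded_decode`, `toy_propose`, `toy_score`, and the `threshold` parameter are hypothetical names, and the hard-coded score table merely stands in for a real latent-space hallucination score.

```python
from typing import Callable, List, Sequence, Tuple

LogEntry = Tuple[str, float, bool]  # (token, hallucination score, accepted?)


def guarded_decode(
    propose: Callable[[Sequence[str]], List[str]],
    score: Callable[[Sequence[str], str], float],
    max_tokens: int = 16,
    threshold: float = 0.5,
    eos: str = "<eos>",
) -> Tuple[List[str], List[LogEntry]]:
    """Hypothetical sketch: check each candidate token before committing it.

    propose(prefix) returns candidate next tokens, best first.
    score(prefix, token) returns a score in [0, 1]; higher means more suspicious.
    """
    output: List[str] = []
    log: List[LogEntry] = []
    for _ in range(max_tokens):
        candidates = propose(output)
        if not candidates:
            break
        accepted = None
        best = None  # least suspicious candidate seen so far (fallback)
        for token in candidates:
            s = score(output, token)
            ok = s <= threshold
            log.append((token, s, ok))  # per-token log entry
            if best is None or s < best[1]:
                best = (token, s)
            if ok:
                accepted = token
                break
            # Otherwise the candidate is pruned and the next one is tried.
        if accepted is None:
            accepted = best[0]  # nothing passed: fall back to the lowest score
        if accepted == eos:
            break
        output.append(accepted)
    return output, log


# Toy stand-ins for a model and a scorer (assumptions for illustration only).
CANDIDATES = [["Paris"], ["is"], ["in"], ["Germany", "France"], ["<eos>"]]
SCORES = {"Germany": 0.9}  # pretend the scorer flags this token


def toy_propose(prefix: Sequence[str]) -> List[str]:
    return CANDIDATES[len(prefix)] if len(prefix) < len(CANDIDATES) else []


def toy_score(prefix: Sequence[str], token: str) -> float:
    return SCORES.get(token, 0.1)


tokens, log = guarded_decode(toy_propose, toy_score)
print(" ".join(tokens))  # Paris is in France
```

In this toy run the flagged candidate `Germany` is pruned and the next candidate is accepted instead, and `log` keeps one entry per checked token for later inspection.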

---

## Installation

```bash
# 1. Create and activate a conda environment
conda create -n tokenguard python=3.10
conda activate tokenguard

# 2. Install GPU PyTorch (adjust CUDA version if needed)
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124

# 3. Install additional dependencies
pip install -r requirements.txt
```
---

## Datasets

> Experiments use six datasets: FinanceBench, DROP, COVID-QA, PubMedQA, HaluEval, and RAGTruth. Download them from [TeraBox] and place them in `data/`, the expected data path.


---

## Quick Start
```bash
bash run.sh
```
---

## Evaluation
```bash
# 1. Run the processing script in eval/process
cd eval/process
python process.py

# 2. Run the evaluation script in eval/
cd ..
python eval.py
```
---