# PRISM - A Multi-Dimensional Verification Approach to Mitigate Hallucinations in Chain-of-Thought Reasoning

## 🧠 Overview

With the rapid advancement of Large Language Models (LLMs), Chain-of-Thought (CoT) prompting has been extensively employed in tasks such as mathematical reasoning and semantic inference. However, existing CoT verification approaches face challenges in striking an optimal balance among reasoning efficiency, the quality of intermediate steps, and final answer accuracy.

To address this issue, we propose **PRISM** (Progressive Reasoning with Instructional and Strategic Multi-perspective Validation), a unified framework that consists of two main stages:

- **Stage 1: Commonsense-Augmented Progressive Instructional Reasoning (CPIR)**  
  This module structurally models the reasoning trajectory to alleviate both conditional and commonsense hallucinations.

- **Stage 2: Multi-Dimensional Heterogeneous Collaborative Verification (MHCV)**  
  This stage verifies the reasoning process from bidirectional and multi-perspective views to mitigate diverse types of hallucinations. We also introduce a **Discard-Weighted Voting** mechanism to improve the robustness of final answer selection compared to traditional majority voting.
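This README does not spell out the Discard-Weighted Voting rule, so the following is only a minimal sketch of one plausible reading: candidate chains whose verification score falls below a threshold are discarded, and the surviving answers vote with their scores as weights. The function name `discard_weighted_vote`, the `discard_below` threshold, and the `(answer, score)` input format are all assumptions made for illustration, not the repository's actual implementation.

```python
from collections import defaultdict


def discard_weighted_vote(candidates, discard_below=0.5):
    """Pick a final answer from (answer, verification_score) pairs.

    Hypothetical scheme: chains scoring below `discard_below` are dropped,
    and the remaining answers vote with their score as the weight.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")

    totals = defaultdict(float)
    for answer, score in candidates:
        if score < discard_below:
            continue  # discard chains that failed verification
        totals[answer] += score

    if not totals:
        # Every chain was discarded: fall back to plain majority voting.
        for answer, _ in candidates:
            totals[answer] += 1.0

    return max(totals, key=totals.get)


# Majority voting would pick "20" (3 votes vs. 2); all three "20" chains
# score below the threshold, so the weighted vote picks "18" instead.
candidates = [("18", 0.9), ("20", 0.4), ("20", 0.3), ("20", 0.45), ("18", 0.7)]
print(discard_weighted_vote(candidates))
```

The example illustrates why such a scheme can be more robust than majority voting: several low-confidence chains that agree on a wrong answer no longer outvote fewer, well-verified chains.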

Experimental results demonstrate that PRISM significantly improves final answer accuracy across various benchmarks, while maintaining coherence and reliability in the reasoning steps.

---

## ⚙️ Setup Instructions

### 1. Create and Activate a Virtual Environment

We recommend using [conda](https://docs.conda.io/) to manage dependencies.

```bash
conda create -n PRISM python=3.11
conda activate PRISM
```

### 2. Install Dependencies

Make sure all required packages are installed:

```bash
pip install -r requirements.txt
```

## 🚀 Running PRISM

### Step 1: Generate Candidate Reasoning Chains (CPIR)

This step produces candidate answers via commonsense-augmented instructional reasoning.

```bash
python generate_candidate_chains.py
```

### Step 2: Configure and Run Verification (MHCV)

Modify the script arguments to match your model name, dataset name, and input/output paths.  
Then run the following command:

```bash
python prism_verification.py \
  --model-name deepseek-v3 \
  --data-name gsm8k \
  --input-result ./results/new_input.json \
  --output-result ./results/new_output.json \
  --task-name Forward_Backward_Verification \
  --n 1
```

### Parameters

- `--model-name` : Name of the LLM used (e.g., `deepseek-v3`)
- `--data-name` : Dataset to evaluate on (e.g., `gsm8k`)
- `--input-result` : Path to the candidate answers JSON file
- `--output-result` : Path to save the verification result
- `--task-name` : Verification strategy to run (e.g., `Forward_Backward_Verification`)
- `--n` : Number of verification rounds (e.g., `1`)
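For readers who want to see how these flags fit together, here is a minimal `argparse` sketch that mirrors the parameter list above. It is an illustration only: the defaults, the `int` type for `--n`, and the choice of required flags are assumptions, and the actual parser in `prism_verification.py` may differ.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the flags documented above (illustrative)."""
    parser = argparse.ArgumentParser(
        description="PRISM verification (illustrative sketch)"
    )
    # Defaults below are assumptions taken from the example command.
    parser.add_argument("--model-name", default="deepseek-v3",
                        help="Name of the LLM used")
    parser.add_argument("--data-name", default="gsm8k",
                        help="Dataset to evaluate on")
    parser.add_argument("--input-result", required=True,
                        help="Path to the candidate answers JSON file")
    parser.add_argument("--output-result", required=True,
                        help="Path to save the verification result")
    parser.add_argument("--task-name", default="Forward_Backward_Verification",
                        help="Verification strategy")
    parser.add_argument("--n", type=int, default=1,
                        help="Number of verification rounds")
    return parser


# argparse exposes hyphenated flags as attributes with underscores.
args = build_parser().parse_args([
    "--input-result", "./results/new_input.json",
    "--output-result", "./results/new_output.json",
    "--n", "1",
])
print(args.model_name, args.data_name, args.task_name, args.n)
```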


### Step 3: Compute Accuracy

After verification, calculate the overall accuracy using:

```bash
python acc_count_verified.py
```
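As a rough illustration of what an overall accuracy computation involves, the sketch below compares predicted answers with gold answers over a list of records. The field names `prediction` and `gold` are hypothetical, since the schema of the verification output is not described here, and `acc_count_verified.py` may normalize answers differently.

```python
def accuracy(records):
    """Return the fraction of records whose prediction matches the gold answer.

    Answers are compared as whitespace-stripped strings; `prediction` and
    `gold` are hypothetical field names used only for this sketch.
    """
    if not records:
        return 0.0
    correct = sum(
        1
        for record in records
        if str(record["prediction"]).strip() == str(record["gold"]).strip()
    )
    return correct / len(records)


records = [
    {"prediction": "18", "gold": "18"},
    {"prediction": "20", "gold": "18"},
    {"prediction": " 7", "gold": "7"},
    {"prediction": "3", "gold": "3"},
]
print(f"accuracy = {accuracy(records):.2%}")
```

Plain string comparison treats `18` and `18.0` as different answers, so numeric datasets may need an extra normalization step before matching.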

### Step 4: Evaluate Computational Cost & Verification Accuracy

This step computes the computational overhead and evaluates verification accuracy in the backward verification experiment.

```bash
python run_backward_experiment.py
```

The results are saved in the `backward_results` folder.


## 📁 Directory Structure

Below is the folder layout of the project:

```plaintext
PRISM/
├── .idea/
├── data/
│   ├── dataset/
│   ├── result/
│   ├── verification_test_50/
│   └── weight_change/
├── deepseek_results/
│   ├── addsub/
│   ├── aqua_rat/
│   ├── date_understanding/
│   ├── gsm8k/
│   └── last_letter/
├── utils/
│   ├── basic.py
│   ├── cot_voting.py
│   ├── deepseek_api_utils.py
│   ├── few_shot_prompts.py
│   ├── few_shot_prompts_addsub.py
│   ├── few_shot_prompts_aqua.py
│   ├── few_shot_prompts_du.py
│   └── few_shot_prompts_ll.py
├── acc_count_verified.py
├── cot.py
├── extract_final_answer.py
├── generate_candidate_chains.py
├── merge_dataset.py
├── output_preprocessing.py
├── prism_verification.py
├── run_backward_experiment.py
├── README.md
└── requirements.txt
```

## 📌 Notes

- Ensure that API access (for hosted models) or model checkpoints (for local models) are correctly configured.
- Adjust prompt templates or model inference parameters in the `utils/` directory if needed.
- PRISM is modular: different CoT generation or verification strategies can be plugged in.

