
# Path-Consistency: Prefix Enhancement for Efficient Inference in LLM

To enhance the reasoning capabilities of large language models (LLMs), self-consistency has gained significant popularity by combining multiple sampling with majority voting. However, state-of-the-art self-consistency approaches consume substantial computational resources and add significant time costs because of the repeated sampling. This prevents self-consistency from reaching its full potential in scenarios where computational resources are critical.
To improve inference efficiency, this paper introduces *path-consistency*, a method that uses the confidence of answers generated in earlier branches to identify the prefix of the most promising path. By dynamically guiding the generation of subsequent branches with this prefix, *path-consistency* mitigates both the errors and the redundancy caused by random or less useful sampling in self-consistency. As a result, it can significantly accelerate inference by reducing the number of tokens generated. Our extensive empirical evaluation shows that *path-consistency* achieves a 7.8% to 40.5% acceleration in inference latency while maintaining or even improving task accuracy across datasets covering mathematical reasoning, commonsense reasoning, symbolic reasoning, and code generation.

## Table of Contents
- [Path-Consistency: Prefix Enhancement for Efficient Inference in LLM](#path-consistency-prefix-enhancement-for-efficient-inference-in-llm)
  - [Table of Contents](#table-of-contents)
  - [Features](#features)
  - [Installation](#installation)
  - [Usage](#usage)
  - [Project Structure](#project-structure)
  - [Dependencies](#dependencies)
  - [License](#license)

## Features
- **Inference with Path Consistency**: Use the `PathConsistency` class to explore multiple reasoning paths based on a given prompt, generating and integrating answers.
- **Configurable Parameters**: Easily customize the inference process with parameters like `max_branch`, `max_level`, `confidence_threshold`, and more.
- **Dataset Support**: Load datasets and perform automated inference with integrated result tracking.
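
To make the role of `max_branch`, `max_level`, and `confidence_threshold` more concrete, here is a minimal sketch of the idea, not the project's actual `PathConsistency` implementation. It assumes a hypothetical `sample_fn(prompt, prefix)` that continues a list of reasoning steps and returns `(all_steps, answer)`, it splits the branch budget evenly across levels, and it uses the plain vote share as the confidence score. All of these are simplifications chosen for illustration.

```python
from collections import Counter


def vote(answers):
    """Return the most common answer and its share of all votes."""
    top, count = Counter(answers).most_common(1)[0]
    return top, count / len(answers)


def path_consistency(sample_fn, prompt, max_branch=20, max_level=3,
                     confidence_threshold=0.8):
    """Sample branches level by level, reusing a promising prefix.

    sample_fn(prompt, prefix) is a hypothetical sampler: it continues the
    reasoning steps in `prefix` and returns (all_steps, answer).
    """
    # Assumption: the branch budget is split evenly across the levels.
    per_level = max(1, max_branch // (max_level + 1))
    prefix, paths = [], []
    for level in range(max_level + 1):
        for _ in range(per_level):
            paths.append(sample_fn(prompt, prefix))
        top, confidence = vote([answer for _, answer in paths])
        if level < max_level and confidence >= confidence_threshold:
            # Pick a path that reached the leading answer and still agrees
            # with the current prefix, then keep one more of its steps.
            best = next((steps for steps, answer in paths
                         if answer == top and steps[:len(prefix)] == prefix),
                        None)
            if best is not None:
                prefix = best[:len(prefix) + 1]
    # The final answer is a majority vote over every sampled branch.
    return vote([answer for _, answer in paths])[0], prefix


def toy_sampler(prompt, prefix):
    """Deterministic stand-in for an LLM: three fixed steps, one answer."""
    steps = list(prefix) + [f"step {i}" for i in range(len(prefix) + 1, 4)]
    return steps, "42"


answer, prefix = path_consistency(toy_sampler, "What is 6 * 7?")
print(answer, prefix)
```

In this sketch, later branches start from a longer prefix, so the sampler has fewer steps left to generate. That is the source of the token savings described above.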

## Installation

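1. **Clone the repository** and change into its directory:
   ```bash
   # Replace <repository-url> with the URL of this repository.
   git clone <repository-url>
   cd PathConsistency
   ```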

2. **Set up a Conda environment**:
   ```bash
   # scipy==1.14.0 (see Dependencies) requires Python 3.10 or newer.
   conda create --name pathconsistency python=3.10
   conda activate pathconsistency
   ```

3. **Install dependencies**:
   Install the required packages listed in `requirements.txt`:
   ```bash
   pip install -r requirements.txt
   ```

## Usage

To use the `PathConsistency` project, follow these steps:

1. **Prepare your dataset**:
   Place your dataset in the `datasets/` directory. The dataset should be in either `.jsonl` or `.json` format.

2. **Prepare your model**:
   Download or prepare the model files and place them in a directory. Make sure it contains:
   - Model checkpoint files (`.pt` or `.bin`)
   - Tokenizer file (`tokenizer.model`)

   Then implement the `CompletionModel` abstract class from `wrapper.py` to provide an inference interface: create a concrete class that inherits from `CompletionModel` and implements the `completion_function` method, which is where the code interacts with your model.


3. **Run the main script**:
   Use the following command to start inference:
   ```bash
   torchrun --nproc_per_node 1 eval_llama.py --dataset your_dataset_name
   ```

   You can also customize the inference process by passing different parameters:
   ```bash
   torchrun --nproc_per_node 1 eval_llama.py --dataset your_dataset_name --max_branch 20 --confidence_thres 0.8
   ```

4. **Output**:
   Results are saved in the `outputs/` directory, and the accuracy of the generated answers is logged.
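
Step 2 asks for a concrete subclass of `CompletionModel`. The sketch below shows one possible shape. Because `wrapper.py` is not reproduced here, the base class in the snippet is only a stand-in. The `completion_function(prompts)` signature (a list of prompts in, a list of completions out), the `CallableCompletionModel` name, and the `stop` parameter are all assumptions, so adapt them to the actual abstract class.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class CompletionModel(ABC):
    """Stand-in for the abstract class in wrapper.py (real signature may differ)."""

    @abstractmethod
    def completion_function(self, prompts: List[str]) -> List[str]:
        """Return one completion per prompt."""


class CallableCompletionModel(CompletionModel):
    """Hypothetical wrapper around any `generate(prompt) -> text` callable."""

    def __init__(self, generate: Callable[[str], str], stop: str = "\n\n"):
        self.generate = generate
        self.stop = stop  # assumed stop sequence; cut the completion here

    def completion_function(self, prompts: List[str]) -> List[str]:
        outputs = []
        for prompt in prompts:
            text = self.generate(prompt)
            # Keep only the text before the first stop sequence.
            outputs.append(text.split(self.stop, 1)[0])
        return outputs


# Usage with a fake generator standing in for a real model call.
model = CallableCompletionModel(lambda prompt: prompt + " -> answer\n\nignored")
print(model.completion_function(["Q1", "Q2"]))
```

Keeping the model call behind a single callable means the same wrapper can sit in front of a local checkpoint or a remote endpoint without changing the rest of the pipeline.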

## Project Structure

- `eval_llama.py`: Entry point for running inference with path-consistency.
- `requirements.txt`: List of required Python packages.
- `README.md`: Project documentation.
- `path_consistency/`: Source code of path-consistency.
- `datasets/`: Directory to store your datasets.
- `outputs/`: Directory where the results of the inference process will be stored.
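
Datasets in `datasets/` may be `.jsonl` or `.json` files. As a rough illustration of how both formats can be read into a list of records, here is a small loader. The `load_dataset` name, its `root` parameter, and the rule that `--dataset your_dataset_name` maps to `datasets/your_dataset_name.jsonl` or `.json` are assumptions, not the project's confirmed behavior.

```python
import json
from pathlib import Path


def load_dataset(name, root="datasets"):
    """Load `<root>/<name>.jsonl` or `<root>/<name>.json` as a list of records."""
    root = Path(root)
    jsonl_path = root / f"{name}.jsonl"
    json_path = root / f"{name}.json"
    if jsonl_path.exists():
        # JSON Lines: one JSON object per non-empty line.
        with jsonl_path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    if json_path.exists():
        with json_path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Wrap a single top-level object so callers always get a list.
        return data if isinstance(data, list) else [data]
    raise FileNotFoundError(f"no {name}.jsonl or {name}.json under {root}")
```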

## Dependencies

The project depends on the following Python libraries:
- `fire==0.6.0`: For command-line interface (CLI) generation.
- `scipy==1.14.0`: For numerical integration in confidence calculations.
- `tqdm==4.66.4`: For progress bar functionality.

These can be installed via `pip` using the provided `requirements.txt` file.
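
This README only states that `scipy` is used for numerical integration in the confidence calculations; it does not give the formula. As a hedged illustration of what such a calculation can look like, the sketch below assumes a Beta posterior over the two leading answers: with `v1` votes for the leader and `v2` for the runner-up, the confidence is the probability that the leader's true share exceeds 0.5. The `beta_confidence` name and this choice of formula are assumptions. The snippet uses `scipy.integrate.quad` when it is installed and falls back to a plain Simpson's rule otherwise.

```python
import math

try:
    from scipy.integrate import quad
except ImportError:  # keep the sketch runnable without scipy
    quad = None


def integrate(f, a, b, n=2000):
    """Integrate f over [a, b] with scipy if available, else Simpson's rule."""
    if quad is not None:
        return quad(f, a, b)[0]
    h = (b - a) / n  # n must be even for composite Simpson's rule
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def beta_confidence(v1, v2):
    """P(p > 0.5) for p ~ Beta(v1 + 1, v2 + 1), i.e. a uniform prior."""
    a, b = v1 + 1, v2 + 1
    # Normalizing constant B(a, b), computed via log-gamma for stability.
    norm = math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))
    return integrate(lambda p: p ** v1 * (1 - p) ** v2 / norm, 0.5, 1.0)


print(beta_confidence(5, 0), beta_confidence(3, 3))
```

A score like this grows with the margin between the two leading answers and with the number of votes, so it can be compared against a threshold such as the `--confidence_thres 0.8` value shown in the usage example.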


## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.
