# DebUnc: Improving Large Language Model Agent Communication Via Uncertainty Metrics

## Installation
```
cd debunc
conda create --name debunc python=3.10 -y 
conda activate debunc 
pip install -e .
```

To use restricted models, log in to Hugging Face with the following command:
```
huggingface-cli login
```

## Usage
The scripts for running and evaluating each benchmark are in [src/debate/](src/debate/).

To use Llama 3 instead of Mistral 7B, make these changes in the script you are running:
- Replace `"mistralai/Mistral-7B-Instruct-v0.2"` with `"meta-llama/Meta-Llama-3-8B-Instruct"`.
- When using Attention-All, replace `current_len = get_len(this_agent, tokenizer) + 3` with `current_len = get_len(this_agent, tokenizer) - 1`.

Other models are not currently supported.

To use TokenSAR instead of Mean Token Entropy as the uncertainty metric, replace `ue_method = MeanTokenEntropy()` with `ue_method = TokenSAR()`.
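
For intuition about what the default metric measures, the sketch below computes a mean token entropy from per-token probability distributions. It is only an illustration using the standard definition (average Shannon entropy over generated positions), not the implementation in `src/lm_polygraph`. The helper names `token_entropy` and `mean_token_entropy` and the toy distributions are assumptions made for this example.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_token_entropy(step_distributions):
    """Average the per-token entropies over all generated positions."""
    if not step_distributions:
        raise ValueError("need at least one generated token")
    entropies = [token_entropy(p) for p in step_distributions]
    return sum(entropies) / len(entropies)


# Toy distributions over a 4-token vocabulary (hypothetical values).
confident = [[0.97, 0.01, 0.01, 0.01], [0.94, 0.02, 0.02, 0.02]]
uncertain = [[0.25, 0.25, 0.25, 0.25], [0.40, 0.30, 0.20, 0.10]]

print(mean_token_entropy(confident))
print(mean_token_entropy(uncertain))
```

A response generated from sharply peaked distributions gets a lower score than one generated from flat distributions, which is the sense in which the score acts as an uncertainty estimate.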

## Attention Scaling Demo
[src/models/demo.ipynb](src/models/demo.ipynb) contains a demonstration of attention scaling applied to RAG.
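
As a rough picture of the idea, the sketch below rescales one row of already-normalized attention weights over chosen token spans and then renormalizes the row. This is an assumption about the general technique, written in plain Python for illustration. The exact scaling rule used in the `modeling_*.py` files may differ, and `scale_attention`, the span format, and the numbers are hypothetical.

```python
def scale_attention(weights, spans):
    """Scale attention weights inside token spans, then renormalize.

    weights: one row of attention weights that sums to 1.
    spans:   (start, end, factor) triples; tokens in [start, end)
             have their weight multiplied by factor.
    """
    scaled = list(weights)
    for start, end, factor in spans:
        for i in range(start, end):
            scaled[i] *= factor
    total = sum(scaled)
    if total == 0:
        raise ValueError("all attention weight was scaled to zero")
    # Renormalize so the row is a valid distribution again.
    return [w / total for w in scaled]


# Four context tokens; tokens 2 and 3 come from a less trusted source,
# so their weights are halved before renormalizing.
row = [0.25, 0.25, 0.25, 0.25]
rescaled = scale_attention(row, [(2, 4, 0.5)])
print(rescaled)
```

Down-weighting a span shifts attention mass toward the remaining tokens, because the row is renormalized to sum to 1.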

## Acknowledgement
The code in [src/lm_polygraph](src/lm_polygraph/) is based on the [LM-Polygraph](https://github.com/IINemo/lm-polygraph) project and contains implementations of various uncertainty metrics.

The `modeling_*.py` files in [src/models](src/models/) are based on the Hugging Face [Transformers](https://github.com/huggingface/transformers) library, with modifications to perform attention scaling. The attention scaling code sits between the `##### <AttentionScaling> #####` and `##### </AttentionScaling> #####` markers.