# Beyond Single-Point Judgment: Distribution Alignment for LLM-as-a-Judge

This repository contains the implementation of our framework for aligning LLM-as-a-Judge models with human judgment distributions, as described in our paper.

## Overview

Large Language Models (LLMs) have emerged as powerful evaluators in the LLM-as-a-Judge paradigm, offering greater efficiency and flexibility than human annotation. However, traditional methods rely primarily on single-point evaluations, which overlook the inherent diversity and uncertainty in human evaluations. This leads to information loss and reduced reliability.

Our approach addresses this limitation by:

1. Explicitly aligning the LLM-generated judgment distribution with the empirical human judgment distribution
2. Using a distributional alignment objective based on KL divergence
3. Adding an auxiliary cross-entropy regularizer to stabilize training
4. Incorporating adversarial training to improve robustness to distribution perturbations
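
The first three points can be illustrated with a small, dependency-free sketch. It is not the code used in this repository. The function names, the direction of the KL term (human distribution as reference), the use of the majority label as the cross-entropy target, and the weight `alpha` are all assumptions made for illustration.

```python
import math
from collections import Counter


def empirical_distribution(labels, num_classes):
    """Convert annotator labels (ints in [0, num_classes)) into a probability vector."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(num_classes)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q). Classes with p_k == 0 contribute nothing to the sum."""
    return sum(pk * math.log(pk / max(qk, eps)) for pk, qk in zip(p, q) if pk > 0)


def cross_entropy(target_index, q, eps=1e-12):
    """Cross-entropy of the model distribution q against a single hard label."""
    return -math.log(max(q[target_index], eps))


def alignment_loss(human_labels, model_probs, alpha=0.1):
    """KL alignment term plus a weighted cross-entropy regularizer.

    Assumption: the regularizer targets the majority human label and
    alpha is a hypothetical weighting hyperparameter.
    """
    num_classes = len(model_probs)
    p = empirical_distribution(human_labels, num_classes)
    majority = max(range(num_classes), key=lambda k: p[k])
    return kl_divergence(p, model_probs) + alpha * cross_entropy(majority, model_probs)


# Five annotators, three classes (e.g. entailment / neutral / contradiction).
human = [0, 0, 0, 1, 2]
print(alignment_loss(human, [0.6, 0.2, 0.2]))  # KL term is zero: distributions match
print(alignment_loss(human, [1 / 3, 1 / 3, 1 / 3]))  # larger: model ignores the human skew
```

In this sketch the KL term vanishes only when the model reproduces the full human distribution, whereas a single-point objective would be satisfied by any prediction whose top class matches the majority label.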

## Key Features

- Distribution-aware alignment instead of single-point evaluation
- Adversarial training framework for enhanced robustness
- Support for multiple LLM backbones (Transformers-based open-source LLMs)
- Support for multiple evaluation tasks (SNLI, MultiNLI, MTBench, SummEval)

## Installation

```bash
# Create and activate conda environment
conda env create -f environment.yml -n <environment_name>
conda activate <environment_name>
```

## Data Preparation

Place the training and test datasets in the `dataset/` directory with the following structure:
```
dataset/
  ├── train/
  │   ├── snli_train.jsonl
  │   ├── multinli_train.jsonl
  │   ├── mtbench_train.jsonl
  │   └── summeval_train.jsonl
  └── test/
      ├── snli_test.jsonl
      ├── multinli_test.jsonl
      ├── mtbench_test.jsonl
      └── summeval_test.jsonl
```
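
Before launching a run, it can help to confirm that the layout above is complete. The following is a minimal sketch of such a check. The helper names `expected_files` and `missing_files` are hypothetical and not part of this repository, and the check assumes the file names follow the `{dataset}_{split}.jsonl` pattern shown in the tree.

```python
from pathlib import Path

# Dataset and split names taken from the directory tree above.
DATASETS = ("snli", "multinli", "mtbench", "summeval")
SPLITS = ("train", "test")


def expected_files(root="dataset"):
    """List every JSONL path the directory layout is expected to contain."""
    root = Path(root)
    return [root / split / f"{name}_{split}.jsonl" for split in SPLITS for name in DATASETS]


def missing_files(root="dataset"):
    """Return the expected paths that do not exist as regular files."""
    return [path for path in expected_files(root) if not path.is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing dataset files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected dataset files are present.")
```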

## Model Setup

Download the required model files and place them in the `models/` directory:
```
models/
  ├── Qwen/
  │   └── Qwen2.5-7B-Instruct/
  └── Llama/
      └── Llama3.1-8B-Instruct/
```

## Training and Evaluation

Training and evaluation are combined in a single script. Before running it, set the following parameters in `train_and_evaluate.sh`:

```bash
# Model paths and names
MODEL_PATH="YOUR_MODEL_PATH"
MODEL_NAME="YOUR_MODEL_NAME"

# GPU ID
GPU_ID="YOUR_GPU_ID"

# Python environment path
PYTHON_EXEC="YOUR_PYTHON_EXEC"
```

To start training and evaluation:
```bash
bash train_and_evaluate.sh
```

## Evaluate OpenAI Models

To evaluate OpenAI models:

1. In `evaluate_openai.sh`, set your Python executable path:

```bash
PYTHON_EXEC="YOUR_PYTHON_EXEC"
```

2. In `configs/LLM_configs.py`, set your OpenAI API base URL and API key:

```python
"openai": {
    "base_url": "YOUR_BASE_URL",
    "api_key": "YOUR_API_KEY"
},
```

Then run:

```bash
bash evaluate_openai.sh
```

## Result Analysis

To analyze the results:
```bash
python result_summary.py
python plot.py
```

After evaluation, results are stored in the following locations:

- Metrics: `evaluation_results/metrics/{dataset_name}/`
- Visualization plots: `evaluation_results/img/{model_name}/`

## Robustness Testing

To test the model's robustness:
```bash
python robust_test.py
```

