# AIR: Component-Wise Analysis Framework for Preference Datasets
Official implementation of the paper *"AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset"*.

## Repository Structure
```text
.
├── README.md                      # Project overview
├── environment.yml                # Conda environment configuration
├── scripts/                       # Automation scripts
│   ├── generate.sh                # Generate responses with a list of LLMs
│   └── train_dpo.sh               # DPO training script
└── src/                           # Core components
    ├── annotate.py                # Annotation strategies
    ├── customize.py               # Generate fine-grained preference questions
    ├── generate.py                # Generate responses with a list of LLMs
    ├── instruction_instag.py      # Construct preference dataset for InsTag instruction filtering
    ├── instruction_variance.py    # Construct preference dataset for variance-based instruction selection
    ├── response_absolute_score.py # Construct preference dataset for absolute score analysis
    ├── response_onoff_mix.py      # Construct preference dataset for on/off-policy mixing
    └── response_score_margin.py   # Construct preference dataset for score margin experiments
```

## Key Features

- **Component-Wise Optimization**: Isolate and optimize:
  - 📜 **Instructions**: Variance-based selection (`instruction_variance.py`)
  - 🤖 **Responses**: Score margins + on/off-policy mixing (`response_*.py`)
  - 🏷️ **Annotations**: Generative vs classifier-based scoring (`annotate.py`)
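
To make the variance-based selection idea concrete, here is a minimal sketch that ranks instructions by the variance of the scores their responses received and keeps a fraction of them. It is not the code in `instruction_variance.py`: the function name `select_by_variance`, the input layout, and the `keep_fraction` and `prefer` parameters are all hypothetical. Whether low- or high-variance instructions should be kept is not stated in this README, so the direction is left as a parameter.

```python
from statistics import pvariance


def select_by_variance(scores_by_instruction, keep_fraction=0.5, prefer="low"):
    """Keep a fraction of instructions, ranked by response-score variance.

    scores_by_instruction: dict mapping an instruction ID to the list of
        scores given to its responses (hypothetical layout).
    keep_fraction: share of instructions to keep (at least one is kept).
    prefer: "low" keeps the lowest-variance instructions, "high" the highest.
    """
    if prefer not in ("low", "high"):
        raise ValueError("prefer must be 'low' or 'high'")

    # Sort instruction IDs by the population variance of their scores.
    ranked = sorted(
        scores_by_instruction,
        key=lambda key: pvariance(scores_by_instruction[key]),
        reverse=(prefer == "high"),
    )
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


if __name__ == "__main__":
    toy_scores = {"a": [1, 1, 1], "b": [1, 5, 9], "c": [4, 5, 6]}
    print(select_by_variance(toy_scores, keep_fraction=0.5, prefer="low"))
```

Swapping `prefer` lets the same helper build both ends of a variance comparison from one set of scored responses.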

## Quick Start

```shell
# Create the conda environment from environment.yml
conda env create -f environment.yml

# Generate responses
bash scripts/generate.sh

# Construct a preference dataset (here: the absolute-score variant)
python src/response_absolute_score.py

# Train DPO model
bash scripts/train_dpo.sh [DATASET_NAME] [DATE] [STRATEGY]
```
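
For readers unfamiliar with what "constructing a preference dataset" involves, the sketch below turns scored responses for one instruction into a single chosen/rejected pair, subject to a score-margin window and an optional floor on the chosen score. It is an illustration under assumptions, not the logic of the `response_*.py` scripts: `build_pair`, its parameters, the default thresholds, and the `prompt`/`chosen`/`rejected` record layout are hypothetical.

```python
def build_pair(instruction, scored_responses,
               min_margin=1.0, max_margin=3.0, min_chosen_score=None):
    """Build one preference pair, or return None if no pair qualifies.

    scored_responses: list of (response_text, score) tuples (assumed layout).
    min_margin / max_margin: allowed gap between chosen and rejected scores.
    min_chosen_score: optional absolute-score floor for the chosen response.
    """
    if len(scored_responses) < 2:
        return None

    # Highest-scoring response becomes the candidate "chosen" answer.
    ranked = sorted(scored_responses, key=lambda item: item[1], reverse=True)
    chosen_text, chosen_score = ranked[0]
    if min_chosen_score is not None and chosen_score < min_chosen_score:
        return None

    # Walk down the ranking; the first response whose margin falls inside
    # the window (i.e. the smallest qualifying margin) becomes "rejected".
    for text, score in ranked[1:]:
        margin = chosen_score - score
        if min_margin <= margin <= max_margin:
            return {"prompt": instruction, "chosen": chosen_text,
                    "rejected": text, "margin": margin}
    return None


if __name__ == "__main__":
    responses = [("A", 9.0), ("B", 8.5), ("C", 7.0), ("D", 3.0)]
    print(build_pair("Explain DPO briefly.", responses))
```

Returning `None` instead of forcing a pair keeps instructions without a qualifying margin out of the dataset, which makes the effect of the margin window easy to inspect.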

