# HuggingFace Dataset Usage

This project loads human annotations exclusively from the HuggingFace dataset `nvidia/judges-verdict-private`.

## Setup

### 1. Set HuggingFace Access Token

Set the environment variable `access_token_for_judges_verdict_private` to a HuggingFace token that has access to the private dataset.

```bash
export access_token_for_judges_verdict_private="your_huggingface_token_here"
```
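To confirm that the variable is visible to Python before running any of the scripts, a small check along these lines may help. This is a minimal sketch: the helper name `read_access_token` is hypothetical and not part of this project; only the variable name comes from the step above.

```python
import os

TOKEN_ENV_VAR = "access_token_for_judges_verdict_private"


def read_access_token(environ=None):
    """Return the token from the environment, or raise a clear error.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    token = environ.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {TOKEN_ENV_VAR!r} is not set or is empty."
        )
    return token
```

Calling `read_access_token()` in the same shell where you ran `export` should return the token. Avoid printing the returned value so the secret does not end up in logs.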

### 2. Verify Access

Test that you can access the dataset:

```bash
python scripts/test_huggingface_loading.py
```
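If you want to check access by hand, the sketch below shows one way to do it with the `datasets` library. It is not the content of `scripts/test_huggingface_loading.py`; the function name `check_dataset_access` is hypothetical, and the `token` keyword assumes a recent `datasets` release (older releases used `use_auth_token` instead).

```python
import os


def check_dataset_access(load_fn=None):
    """Load the train split and return its number of rows.

    `load_fn` is an optional stand-in for `datasets.load_dataset`,
    which lets the check be exercised without network access.
    """
    if load_fn is None:
        # Imported lazily so the library is only needed for a real check.
        from datasets import load_dataset as load_fn

    token = os.environ.get("access_token_for_judges_verdict_private")
    if not token:
        raise RuntimeError("access_token_for_judges_verdict_private is not set")

    dataset = load_fn("nvidia/judges-verdict-private", split="train", token=token)
    return len(dataset)
```

With a valid token, `check_dataset_access()` should return the number of examples in the `train` split.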

## Usage

### In the Gradio App (app.py)

The Gradio app loads annotations from the HuggingFace dataset by default. To start it, run:

```bash
python app.py
```

### Export Leaderboards to CSV

To export the leaderboards as CSV files, run:

```bash
python scripts/export_leaderboards_to_csv.py
```

### Programmatic Usage

```python
from src.leaderboard_generator import load_human_annotations, generate_leaderboard_data

# Load annotations from HuggingFace
annotations = load_human_annotations()

# Generate leaderboard using HuggingFace data
open_source_df, closed_df = generate_leaderboard_data()
```

## Dataset Information

Key facts about the HuggingFace dataset:
- **Dataset name**: `nvidia/judges-verdict-private`
- **Split**: `train`
- **Number of examples**: 1994
- **Features**:
  - `item_name`: Unique identifier for each item
  - `dataset_name`: Source dataset name
  - `question`: The question/prompt
  - `gt_answer`: Ground truth answer
  - `gen_answer`: Generated answer to evaluate
  - `annotations`: List of human annotations with scores and justifications
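To make the record layout concrete, the sketch below builds one illustrative record and averages its human scores. The top-level keys follow the feature list above; the inner annotation keys (`score`, `justification`), all values, and the helper `mean_human_score` are assumptions made for illustration, not taken from the dataset.

```python
from statistics import mean

# Hypothetical record shaped like the feature list above.
# The inner annotation keys and every value here are made up.
example = {
    "item_name": "item-0001",
    "dataset_name": "example_source",
    "question": "What is the capital of France?",
    "gt_answer": "Paris",
    "gen_answer": "The capital of France is Paris.",
    "annotations": [
        {"score": 1.0, "justification": "Matches the ground truth."},
        {"score": 0.5, "justification": "Correct but wordy."},
    ],
}


def mean_human_score(record):
    """Average the scores across all annotations of one record.

    Returns None when the record has no annotations.
    """
    scores = [annotation["score"] for annotation in record["annotations"]]
    return mean(scores) if scores else None


print(mean_human_score(example))
```

Depending on how the `annotations` feature is declared, the `datasets` library may return it as a dict of lists rather than a list of dicts, so inspect one real record before relying on this shape.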

## Troubleshooting

1. **Token not found error**: Make sure the environment variable `access_token_for_judges_verdict_private` is set in the same shell session that runs the script.
2. **Access denied**: Verify that your HuggingFace token has access to the `nvidia/judges-verdict-private` dataset.
3. **Dataset loading errors**: Check your internet connection and the HuggingFace service status.
