

# Unmasking and Improving Data Credibility: A Study with Datasets for Training Harmless Language Models

## For the review process of ICLR 2024 ONLY. PLEASE DO NOT DISTRIBUTE.


### Find label errors

Before running the scripts below, replace each occurrence of `YOUR_PATH` with your local path.
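
If you would rather not edit the files by hand, the substitution can be sketched with `sed`. This is a minimal sketch, not part of the repository: `replace_placeholder` is a hypothetical helper, and it assumes the placeholder appears literally as `YOUR_PATH` in the files you pass to it and that your path contains no `|`, `&`, or backslash characters.

```bash
# Hypothetical helper, not shipped with this repository: replace the
# YOUR_PATH placeholder in the given files with a real directory.
replace_placeholder() {
  local new_path="$1"
  shift
  local f
  for f in "$@"; do
    # '|' is used as the sed delimiter so slashes in the path need no escaping.
    # -i.bak works with both GNU and BSD sed; the backup is removed on success.
    sed -i.bak "s|YOUR_PATH|${new_path}|g" "$f" && rm -f "${f}.bak"
  done
}

# Example usage (the directory is only an illustration):
# replace_placeholder "$HOME/data" run_jigsaw.sh run_pku.sh run_anthropic.sh
```

Run `grep -n YOUR_PATH <file>` first to check which files actually contain the placeholder.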


```bash
bash run_jigsaw.sh
bash run_pku.sh
bash run_anthropic.sh
```

### Prepare data for downstream tasks
```bash
bash preppare_data_for_training.sh
```