# Improving Calibration of Fine-tuned Language Models via Denoising VAEs


## Set up the running environment
``` bash
# (Optional) create a virtual environment
conda create -n LM-Calibration python=3.8
conda activate LM-Calibration

# Install torch according to your own environment (CUDA 11.3 is shown here).
pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113

# Install the other required packages
pip install -r requirements.txt
```

To train and evaluate parameter-efficient tuning methods, you need to install the OpenDelta library by following its official instructions.

You may also need to run `accelerate config` to configure Accelerate.
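
Before moving on, it can help to confirm that the environment is usable. The following is a minimal sanity-check sketch, not part of the repository: it assumes `python` and `accelerate` are on `PATH` inside the activated environment, and it only reports what it finds. The `ENV_CHECK_DONE` variable is a hypothetical marker added for illustration.

```bash
#!/usr/bin/env bash
# Sanity-check sketch for the environment set up above.
# Each check is guarded so the script reports what is missing instead of aborting.

if command -v python >/dev/null 2>&1; then
  # Print the installed torch version and whether CUDA is visible.
  python -c "import torch; print(torch.__version__, torch.cuda.is_available())" \
    || echo "torch is not importable in this environment" >&2
else
  echo "python not found on PATH" >&2
fi

if command -v accelerate >/dev/null 2>&1; then
  # 'accelerate env' prints the current Accelerate configuration.
  accelerate env
else
  echo "accelerate not found; install the requirements first" >&2
fi

# Hypothetical marker: the checks above have all been attempted.
ENV_CHECK_DONE=1
```

If `accelerate env` shows no saved configuration, running `accelerate config` once and answering its prompts should create one.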

## Prepare datasets
1. Download the NLU task datasets from https://github.com/shreydesai/calibration and unzip them into the data folder.
2. Convert the datasets to the Hugging Face Datasets format. We provide our preprocessing script in `./utils/preprocess.py` and the script that runs it in `./scripts/preprocess_dataset.sh`.
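
The two steps above can be sketched as a short script run from the repository root. This is an illustration under assumptions: the archive name `calibration_data.zip` and the `./data` target are hypothetical placeholders (override them with `ARCHIVE` and `DATA_DIR`), and the script assumes `preprocess_dataset.sh` needs no extra arguments.

```bash
#!/usr/bin/env bash
# Sketch of the data-preparation steps, run from the repository root.
# ASSUMPTIONS: the archive name and the ./data target below are hypothetical;
# adjust them to match the file you actually downloaded.

ARCHIVE="${ARCHIVE:-calibration_data.zip}"   # hypothetical file name
DATA_DIR="${DATA_DIR:-./data}"               # assumed location of the data folder

mkdir -p "$DATA_DIR"

# Step 1: extract the downloaded archive into the data folder.
if [ -f "$ARCHIVE" ]; then
  case "$ARCHIVE" in
    *.zip)          unzip -o "$ARCHIVE" -d "$DATA_DIR" ;;
    *.tar.gz|*.tgz) tar -xzf "$ARCHIVE" -C "$DATA_DIR" ;;
    *)              echo "Unsupported archive type: $ARCHIVE" >&2 ;;
  esac
else
  echo "Archive $ARCHIVE not found; download it first." >&2
fi

# Step 2: convert the raw files to the Hugging Face Datasets format.
if [ -f ./scripts/preprocess_dataset.sh ]; then
  bash ./scripts/preprocess_dataset.sh
else
  echo "./scripts/preprocess_dataset.sh not found; run this from the repository root." >&2
fi
```

Check the paths inside `./scripts/preprocess_dataset.sh` before running it, since the input and output locations it expects may differ from the ones assumed here.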


## Experiments
* For full fine-tuning, see `./scripts/run_classification.sh` and `./scripts/run_multiple_choice.sh`.
* For parameter-efficient tuning methods, see `./scripts/run_classification_delta.sh` and `./scripts/run_multiple_choice_delta.sh`.
* For training and evaluating DVAEs, see `./scripts/run_vae_cls.sh` and `./scripts/run_vae_mc.sh`.
* For evaluating calibration on MLM tasks, see `./scripts/run_mlm_conf.sh` and `./utils/plot_rd.py`.
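
As a rough sketch of how one of the scripts listed above might be launched, the helper below checks that a script exists before running it. `run_experiment` is a hypothetical helper, not part of the repository, and it assumes the scripts run without extra arguments; open each script first to check its paths, model names, and GPU settings.

```bash
#!/usr/bin/env bash
# Sketch: launch one of the experiment scripts from the repository root.
# ASSUMPTION: the scripts take no required arguments; run_experiment is hypothetical.

run_experiment() {
  local script="$1"
  if [ -f "$script" ]; then
    echo "Running $script"
    bash "$script"
  else
    echo "Skipping $script (not found; run from the repository root)" >&2
    return 1
  fi
}

# Example: full fine-tuning on the classification tasks.
run_experiment ./scripts/run_classification.sh || true
```

The same helper can be pointed at any of the other scripts, for example `run_experiment ./scripts/run_vae_cls.sh`.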

