
First, set the corresponding paths in `_settings.py`.


## Prepare the data
Run `semantic_uncertainty/parse_coqa.py` to generate the CoQA data.

## Generate the responses
Use `llama-13b-hf` or `opt-13b` as the model, and `coqa`, `triviaqa` or `nq_open` as the dataset in the command below. (You need to download the LLaMA weights first.)
`python -m pipeline.generate --model llama-13b-hf --dataset coqa`
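
To cover every model/dataset combination listed above, the command can be wrapped in a small driver script. The following is only a sketch: the helper names `build_commands` and `run_all` are hypothetical and not part of this repo, and it assumes it is launched from the repo root so that `pipeline.generate` is importable. It uses only the `--model` and `--dataset` flags shown above.

```python
import itertools
import subprocess
import sys

# Model and dataset names taken from this README.
MODELS = ["llama-13b-hf", "opt-13b"]
DATASETS = ["coqa", "triviaqa", "nq_open"]


def build_commands(models=MODELS, datasets=DATASETS):
    """Return one `python -m pipeline.generate` command per (model, dataset) pair."""
    return [
        [sys.executable, "-m", "pipeline.generate", "--model", m, "--dataset", d]
        for m, d in itertools.product(models, datasets)
    ]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical default: only print the commands, do not launch generation.
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to check the combinations before committing GPU time.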

Use `pipeline.generate_bb` for `gpt-3.5-turbo` experiments (update `keys.json` with your API keys first).

Update `GEN_PATHS` in `_settings.py` before moving on to the next steps.

## Run UQ experiments
You can run `dataeval/load.py` first to cache the results.
Then refer to `notebook/demo.ipynb` for an example.
