## Diamante: Towards Boosting the Open-Domain Chatbot with Human Feedback

Diamante is a novel and efficient framework that combines a data collection strategy with a learning method to boost the performance of pre-trained dialogue models. It collects and leverages two kinds of human feedback: explicit demonstration and implicit preference.

The Diamante dataset is now publicly available on a data platform. Because of the conference anonymity requirement and data license constraints, we sample 100 cases from the train, valid, and test sets for demonstration in this folder.

We have included the Diamante source code in this folder. To explain Diamante's usage, we take the training and inference of a small dialogue model [CDial-GPT](https://arxiv.org/pdf/2008.03946.pdf) as an example.


### Dataset

The sampled Diamante dataset is in the folder ***data/raw_data/***. The official Diamante dataset consists of 6,838 dialogues with 98,115 utterances. The dataset is split into train, validation, and test sets.

| Json Key Name       | Description                                 |
| ------------------- | ------------------------------------------- |
| id                  | dialogue id                                 |
| conversation        | the whole dialogue                          |
| role                | current speaker                             |
| utterance           | current utterance                           |
| response_candidates | candidate responses for the current context |
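
As a rough illustration of how these keys might be read, the sketch below parses one dialogue record and walks through its turns. Only the key names come from the table above; the nesting (a list of turn objects under `conversation`), the sample values, and the helper `iter_turns` are assumptions made for this example, so check them against the files in ***data/raw_data/*** before relying on them.

```python
import json

# Hypothetical record: the nesting and values are assumptions for illustration.
# Only the key names (id, conversation, role, utterance, response_candidates)
# come from the table above.
SAMPLE_RECORD = json.dumps({
    "id": "dialogue-0",
    "conversation": [
        {"role": "speaker_1", "utterance": "Hello!"},
        {
            "role": "speaker_2",
            "utterance": "Hi, how are you?",
            "response_candidates": ["Hi, how are you?", "Hello."],
        },
    ],
})


def iter_turns(record):
    """Yield (role, utterance, candidates) for every turn of one dialogue."""
    for turn in record["conversation"]:
        # Assume turns without candidates simply omit the key.
        candidates = turn.get("response_candidates", [])
        yield turn["role"], turn["utterance"], candidates


record = json.loads(SAMPLE_RECORD)
turns = list(iter_turns(record))
for role, utterance, candidates in turns:
    print(f"{role}: {utterance} ({len(candidates)} candidates)")
```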

Statistics of the Diamante dataset:

| Diamante                  |      Train      |      Valid      |      Test       |      Total      |
| :------------------------ | :-------------: | :-------------: | :-------------: | :-------------: |
| Number of Dialogues       |      5,838      |       500       |       500       |      6,838      |
| Number of Utterances      |     83,765      |      7,166      |      7,184      |     98,115      |
| Average Utterance Length  |      14.26      |      14.20      |      14.29      |      14.25      |
| Select / Revise / Rewrite | 18% / 41% / 41% | 19% / 40% / 41% | 19% / 40% / 41% | 18% / 41% / 41% |
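
The per-split counts can be cross-checked against the totals with a few lines of arithmetic. The snippet below is only a sanity check on the numbers in the table; the dictionary layout is an illustrative choice, not part of the dataset tooling.

```python
# Counts copied from the statistics table above.
SPLITS = {
    "train": {"dialogues": 5838, "utterances": 83765},
    "valid": {"dialogues": 500, "utterances": 7166},
    "test": {"dialogues": 500, "utterances": 7184},
}
REPORTED_TOTAL = {"dialogues": 6838, "utterances": 98115}

# Sum each column over the three splits.
total_dialogues = sum(split["dialogues"] for split in SPLITS.values())
total_utterances = sum(split["utterances"] for split in SPLITS.values())

# Average number of utterances per dialogue over the whole dataset.
utterances_per_dialogue = total_utterances / total_dialogues

print(total_dialogues == REPORTED_TOTAL["dialogues"])
print(total_utterances == REPORTED_TOTAL["utterances"])
print(f"{utterances_per_dialogue:.2f} utterances per dialogue")
```

The split counts add up to the reported totals, and each dialogue contains a little over 14 utterances on average.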



### Quick Start

#### Requirements and Installation

- python version >= 3.7
- paddlepaddle-gpu version >= 2.0.0
  - You can install PaddlePaddle following [the instructions](https://www.paddlepaddle.org.cn/documentation/docs/en/install/index_en.html).
  - The PaddlePaddle build you need depends on your [CUDA version](https://developer.nvidia.com/cuda-downloads) (recommended version: 10.1) and [cuDNN version](https://developer.nvidia.com/rdp/cudnn-download) (recommended version: 7.6). See the [PaddlePaddle documentation on GPU support](https://www.paddlepaddle.org.cn/documentation/docs/en/install/index_en.html#paddlepaddle-s-support-for-gpu) for more information.
- sentencepiece
- termcolor
- If you want to run distributed training, you'll also need [NCCL](https://developer.nvidia.com/nccl/nccl-download)
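
Before running the scripts, it can help to confirm that the interpreter and packages listed above are in place. The following is a minimal sketch, not part of the repository: the helpers `python_ok` and `missing_modules` are hypothetical names, and the check only tests whether each module can be found, not whether its version or GPU support is correct.

```python
import sys
from importlib.util import find_spec

# Import names of the packages listed above (paddlepaddle-gpu installs "paddle").
REQUIRED_MODULES = ["paddle", "sentencepiece", "termcolor"]


def python_ok(version_info=sys.version_info, minimum=(3, 7)):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


print("Python >= 3.7:", python_ok())
print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Once `paddle` is importable, PaddlePaddle's own `paddle.utils.run_check()` can be used to verify the installation itself.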


#### Data Preparation

```bash
python data/build_eval_data.py
```


#### Model Preparation

Download the dialogue model [CDial-GPT](https://huggingface.co/thu-coai/CDial-GPT2_LCCC-base) and convert it to PaddlePaddle parameters. 

```bash
wget https://huggingface.co/thu-coai/CDial-GPT2_LCCC-base/resolve/main/pytorch_model.bin
```

```bash
python knover/tools/gpt2_convert.py --param_path path/to/pytorch_model.bin --save_path cdial-gpt2_conf/cdial-gpt2/params/
```


#### Training

Apply Diamante's dataset and joint training paradigm to CDial-GPT:

```bash
./scripts/local/job.sh cdial-gpt2_conf/train.conf
```

The training configuration can be modified in ***cdial-gpt2_conf/train.conf***.

By default, the best and latest checkpoints are saved to the ***output/*** folder.


#### Interacting with the model

```bash
# Decompress the saved model
tar -xvf output/best.tar

# Interact with the model
./scripts/local/job.sh cdial-gpt2_conf/interact.conf
```


#### Model self-chat

Here, we provide some self-chat topics in ***data/self-chat_topics.sample.txt***.

You can modify ***cdial-gpt2_conf/self_chat.conf*** to conduct self-chat with either CDial-GPT or CDial-GPT (Diamante). The script uses CDial-GPT (Diamante) by default.

```bash
# Decompress the saved model
tar -xvf output/best.tar

# Self-chat using the model
./scripts/local/job.sh cdial-gpt2_conf/self_chat.conf
```

In our experiments, applying Diamante to CDial-GPT improves all four evaluation metrics (coherence, informativeness, safety, and engagingness), at least doubling each score.

|                      | Coherence | Informativeness | Safety    | Engagingness |
| :------------------- | :-------: | :-------------: | :-------: | :----------: |
| CDial-GPT            |   0.484   |      0.400      | 0.660     |    0.140     |
| CDial-GPT (Diamante) | **0.968** |    **0.960**    | **1.368** |  **0.480**   |
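
To make the size of these gains concrete, the ratios between the two rows can be computed directly from the table. This is plain arithmetic on the reported scores; the variable names are illustrative.

```python
# Scores copied from the table above: (CDial-GPT, CDial-GPT (Diamante)).
SCORES = {
    "Coherence": (0.484, 0.968),
    "Informativeness": (0.400, 0.960),
    "Safety": (0.660, 1.368),
    "Engagingness": (0.140, 0.480),
}

# Ratio of the Diamante score to the baseline score for each metric.
ratios = {metric: diamante / base for metric, (base, diamante) in SCORES.items()}

for metric, ratio in ratios.items():
    print(f"{metric}: x{ratio:.2f}")
```

Every ratio is about 2 or higher, with the largest relative gain on engagingness.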
