An Automatic Evaluation Method for Open-domain Dialogue Based on BLEURT

Published: 01 Jan 2022, Last Modified: 09 Jun 2023, IRI 2022
Abstract: Automatic open-domain dialogue generation is an important topic in natural language generation research. Because good automatic evaluation methods are lacking, generated dialogue is usually evaluated manually, which makes it difficult to compare different generation models. Earlier automatic evaluation methods for natural language generally require a reference corpus; for open-domain dialogue generation, however, a reference corpus limits the range of acceptable responses. To evaluate the quality of a generative model stably, we study a learning-to-evaluate approach for generative dialogue based on deep learning. The experimental corpus consists of conversations collected from the web and conversations generated with the GPT-2 model. We evaluate these dialogues manually as a gold standard, and our system uses the BLEURT-20 model to learn an automatic evaluation model. We find that the learned model is suitable as an automatic evaluation mechanism for dialogue generation.
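For illustration, the following is a minimal sketch of scoring a generated dialogue response against a reference with BLEURT-20, assuming the official bleurt Python package and a locally downloaded BLEURT-20 checkpoint (the checkpoint path and example utterances are illustrative); the paper's further training of the evaluation model on human-rated web and GPT-2 dialogues is not reproduced here.

    from bleurt import score  # pip install git+https://github.com/google-research/bleurt.git

    # Path to a locally downloaded BLEURT-20 checkpoint (illustrative path).
    checkpoint = "checkpoints/BLEURT-20"
    scorer = score.BleurtScorer(checkpoint)

    # Score generated dialogue responses against reference responses.
    references = ["I'm doing well, thanks for asking. How about you?"]
    candidates = ["I am fine, thank you. And you?"]
    scores = scorer.score(references=references, candidates=candidates)
    print(scores)  # one float per (reference, candidate) pair; higher means closer to the reference

In a learned-evaluation setting such as the one described above, scores like these would be compared against, or fine-tuned on, human judgments rather than used as-is.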