UniDE: A multi-level and low-resource framework for automatic dialogue evaluation via LLM-based data augmentation and multitask learning

Published: 01 Jan 2025, Last Modified: 18 Apr 2025, Inf. Process. Manag. 2025, CC BY-SA 4.0
Abstract: Highlights
• A unified dialogue evaluation framework with measurement dimensions at multiple levels.
• A small evaluator outperforms a GPT-4 evaluator, highlighting low-resource efficiency.
• A feasible NLP paradigm is to train small LLMs on high-quality data from large LLMs.
• Extensive experimental results on popular datasets validate the superiority of UniDE.