UniDE: A multi-level and low-resource framework for automatic dialogue evaluation via LLM-based data augmentation and multitask learning

Guanghui Ye, Huan Zhao, Zixing Zhang, Zhihua Jiang

Published: 2025, Last Modified: 18 Apr 2025. Inf. Process. Manag. 2025. CC BY-SA 4.0

Abstract: Highlights
• A unified dialogue evaluation framework with diverse-level measurement dimensions.
• A small evaluator outperforms a GPT-4 evaluator, highlighting low-resource efficiency.
• A feasible NLP paradigm is to train small LLMs on high-quality data from large LLMs.
• Extensive experimental results on popular datasets validate the superiority of UniDE.