Keywords: Telephone-based customer service, Foundation models, AI contact center, Automatic Evaluation Metric
Abstract: High-quality Contact Centers (CCs) must meet several essential requirements. Inter alia, correct understanding, courteous interaction, and accurate information provision are crucial. The recent advent of foundation models with high generalization performance has raised expectations for their use in CC applications. We therefore explore the feasibility of foundation models for AI Contact Centers (AICCs). For this purpose, (1) we propose a new dataset of customer service conversations focused on government services in Korea's capital, crafted by experts who work in this service field. (2) We combine audio- and text-based foundation models to construct the AICC framework: audio is transcribed to text, and Large Language Models (LLMs) supplied with prior information generate responses to the transcribed text so that the answers are factual. (3) Human evaluators assess the validity of the LLM-generated answers as agent answers. Furthermore, we propose an LLM-based automatic evaluation method, called a generative model-based hierarchical dialog evaluation metric, and compare it with the results of human evaluators to further investigate the feasibility of foundation model-based evaluation.
Submission Number: 4