Automatic concept identification in goal-oriented conversations

Ananlada Chotimongkol, Alexander I. Rudnicky

Published: 2002, Last Modified: 25 Jan 2025, INTERSPEECH 2002, CC BY-SA 4.0

Abstract: We address the problem of automatically identifying key domain concepts from an unannotated corpus of goal-oriented human-human conversations. We examine two clustering algorithms, one based on mutual information and the other on Kullback-Leibler distance. To compare the two techniques quantitatively, we evaluate the resulting clusters against reference concept labels using precision and recall metrics adopted from the evaluation of the topic identification task. However, since our system allows more than one cluster to be associated with each concept, an additional metric, a singularity score, is added to better capture cluster quality. Based on the proposed quality metrics, the results show that Kullback-Leibler-based clustering outperforms mutual information-based clustering both in the optimal quality and in the quality achieved using an automatic stopping criterion.
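
As a rough illustration of the kind of distance the abstract refers to, the sketch below computes a symmetric Kullback-Leibler distance between two words' context distributions. The abstract does not give the exact formulation used in the paper, so the symmetrization (summing both directions), the add-alpha smoothing, the function names, and the toy counts are all assumptions made for illustration only.

```python
import math

def smooth(counts, alpha=1.0):
    """Turn raw context counts into a probability distribution.

    Add-alpha smoothing is an assumption here; it keeps every
    probability nonzero so the KL terms below stay finite.
    """
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

def kl_divergence(p, q):
    """D(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetric KL distance: D(p || q) + D(q || p) (assumed form)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Hypothetical toy counts of how often each word appears in three contexts.
word_a = smooth([8, 1, 0])
word_b = smooth([7, 2, 0])
word_c = smooth([0, 2, 7])

# Words with similar context distributions get a smaller distance,
# so a clustering step would tend to merge them first.
print(symmetric_kl(word_a, word_b))
print(symmetric_kl(word_a, word_c))
```

In a hierarchical clustering setting, a distance like this would be computed for every pair of candidate words or clusters, with the closest pair merged at each step until a stopping criterion is met.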