Revisiting Automated Topic Model Evaluation with Large Language Models

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
Venue: EMNLP 2023 Main
Submission Type: Regular Short Paper
Submission Track: Theme Track: Large Language Models and the Future of NLP
Submission Track 2: Information Retrieval and Text Mining
Keywords: topic model evaluation, interpretability, large language models, text clustering
TL;DR: Automatically evaluating topic model output has been a longstanding challenge; we find that large language models turn out to be very good at this task.
Abstract: Topic models help us make sense of large text collections. Automatically evaluating their output and determining the optimal number of topics are both longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models (LLMs) for these tasks. We find that LLMs appropriately assess the resulting topics, correlating more strongly with human judgments than existing automated metrics. However, the setup of the evaluation task is crucial: LLMs perform better on coherence ratings of word sets than on intrusion detection. We find that LLMs can also guide us towards a reasonable number of topics. In actual applications, topic models are typically used to answer a research question related to a collection of texts. We can incorporate this research question in the prompt to the LLM, which helps estimate the optimal number of topics.
Submission Number: 1499
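
The abstract describes prompting an LLM to rate the coherence of a topic's top words, optionally conditioning the rating on the user's research question. The following is a minimal sketch of that idea, assuming an OpenAI-style chat completion API; the prompt wording, model name, and 1–3 rating scale are illustrative assumptions, not the authors' exact setup.

```python
# Sketch of LLM-based topic coherence rating, assuming the openai>=1.0 Python SDK.
# Prompt text, model, and scale are illustrative; they are not the paper's exact prompts.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def rate_topic_coherence(top_words: list[str],
                         research_question: str | None = None,
                         model: str = "gpt-3.5-turbo") -> str:
    """Ask an LLM to rate how related a topic's top words are (1 = low, 3 = high)."""
    prompt = (
        "You will see the top words of one topic from a topic model. "
        "Rate how related the words are to each other on a scale from "
        "1 (not related) to 3 (very related). Reply with only the number.\n\n"
        f"Words: {', '.join(top_words)}"
    )
    if research_question:
        # Condition the judgment on the analyst's research question,
        # as the abstract suggests for choosing the number of topics.
        prompt += f"\n\nJudge relatedness with respect to this research question: {research_question}"

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic ratings for evaluation
    )
    return response.choices[0].message.content.strip()


# Example: rate one topic's top words.
print(rate_topic_coherence(["gene", "dna", "protein", "genome", "sequence"]))
```

Averaging such per-topic ratings would give a single model-level score that can be compared across runs with different numbers of topics; this aggregation step is an assumption of the sketch, not a detail stated in the abstract.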