RecLLMSim: A Comprehensive Task-based Recommendation Conversation Dataset Generated by Large Language Models
Abstract: Conversational systems have attracted significant attention in recent years.
However, collecting conversational datasets has traditionally been a time-consuming and labor-intensive process.
With the advent of large language models (LLMs), there is a growing interest in using them to generate synthetic datasets.
Given LLMs' strong role-playing capabilities, they hold the potential to simulate users effectively.
This capability allows for the automated generation of conversations, with LLMs acting as both users and assistants across various scenarios.
Our study proposes a framework designed to generate task-based recommendation conversation datasets across multiple scenarios with LLMs.
Using this framework, we created a comprehensive conversational dataset named RecLLMSim.
We conducted extensive experiments to measure the quality of the user simulator and the assistant, and annotated user intent and hallucinations to improve the dataset's usability.
Experimental results demonstrate that using LLMs as user simulators is a promising approach.
In addition, the generated RecLLMSim dataset can be adapted for various tasks such as user profiling and simulation, offering a rich resource for further advancements in conversational systems.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Resources and Evaluation, NLP Applications, Dialogue and Interactive Systems, Information Retrieval and Text Mining
Contribution Types: Data resources, Data analysis
Languages Studied: English, Chinese
Submission Number: 2883