Abstract: While research on dialogue response generation has primarily focused on generating coherent responses conditioned on textual context, the critical question of when to respond, grounded in temporal context, remains underexplored.
To bridge this gap, we propose a novel task called timely dialogue response generation and introduce the TimelyChat benchmark, which evaluates the ability of language models to predict appropriate time intervals and generate time-conditioned responses.
Additionally, we construct a large-scale training dataset by leveraging unlabeled event knowledge from a temporal commonsense knowledge graph and employing a large language model (LLM) to synthesize 55K event-driven dialogues.
We then train TimeR, a dialogue agent designed to proactively predict time intervals and generate timely responses that align with those intervals.
Experimental results show that TimeR outperforms prompting-based LLMs and other fine-tuned baselines in both turn-level and dialogue-level evaluations.
We publicly release our data, model, and code.
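To make the two-step turn described above concrete, here is a minimal Python sketch of how an agent might first predict a time interval and then condition its response on that interval. All identifiers (`predict_time_interval`, `build_time_conditioned_input`, the `<time>` marker) are hypothetical illustrations, not the paper's actual interface.

```python
# Hypothetical sketch of a timely-response turn; not the paper's actual API.

def predict_time_interval(history: list[str]) -> str:
    """Predict how long the agent should wait before responding.

    A real system would query a trained model such as TimeR; a fixed
    placeholder interval is returned here for illustration only.
    """
    return "2 hours"


def build_time_conditioned_input(history: list[str], interval: str) -> str:
    """Encode the predicted interval into the generation input.

    The interval is prepended with a marker token so the generator can
    produce a response consistent with the time that has elapsed.
    """
    return f"<time> {interval} </time> " + " ".join(history)


history = ["A: I just put the cake in the oven."]
interval = predict_time_interval(history)            # step 1: when to respond
model_input = build_time_conditioned_input(history, interval)  # step 2: what to say
print(model_input)
```

The key design point this sketch illustrates is that the interval is predicted proactively and then fed back as conditioning context, so the generated response can acknowledge the elapsed time rather than replying immediately.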
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: evaluation and metrics, conversational modeling, automatic creation and evaluation of language resources, NLP datasets
Contribution Types: Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 7794